(57) Abstract

The present invention relates to host cells having modified lipid-linked
oligosaccharides which may be modified further by heterologous expression of a
set of glycosyltransferases, sugar transporters and mannosidases to become
host-strains for the production of mammalian, e.g., human therapeutic
glycoproteins. The process provides an engineered host cell which can be used
to express and target any desirable gene(s) involved in glycosylation. Host
cells with modified lipid-linked oligosaccharides are created or selected.
N-glycans made in the engineered host cells have a GlcNAcMan3GlcNAc2 core
structure which may then be modified further by heterologous expression of one
or more enzymes, e.g., glycosyltransferases, sugar transporters and
mannosidases, to yield human-like glycoproteins. For the production of
therapeutic proteins, this method may be adapted to engineer cell lines in
which any desired glycosylation structure may be obtained.

METHODS TO ENGINEER MAMMALIAN-TYPE CARBOHYDRATE STRUCTURES

CROSS-REFERENCE TO RELATED APPLICATIONS 

[0001] This application claims priority to U.S. provisional application Ser. No.
60/344,169, filed Dec. 27, 2001, which is incorporated by reference herein in its
entirety.

FIELD OF THE INVENTION 

[0002] The present invention generally relates to modifying the glycosylation
structures of recombinant proteins expressed in fungi or other lower eukaryotes, to
more closely resemble the glycosylation of proteins of higher mammals, in
particular humans.

BACKGROUND OF THE INVENTION 

[0003] After DNA is transcribed and translated into a protein, further
post-translational processing involves the attachment of sugar residues, a process
known as glycosylation. Different organisms produce different glycosylation enzymes
(glycosyltransferases and glycosidases), and have different substrates (nucleotide
sugars) available, so that the glycosylation patterns as well as composition of the
individual oligosaccharides, even of one and the same protein, will be different
depending on the host system in which the particular protein is being expressed.
Bacteria typically do not glycosylate proteins, and if so only in a very unspecific
manner (Moens, 1997). Lower eukaryotes such as filamentous fungi and yeast add
primarily mannose and mannosylphosphate sugars, whereas insect cells such as
Sf9 cells glycosylate proteins in yet another way. See for example (Bretthauer,
1999; Martinet, 1998; Weikert, 1999; Malissard, 2000; Jarvis, 1998; and Takeuchi,
1997).

[0004] Synthesis of a mammalian-type oligosaccharide structure consists of a
series of reactions in the course of which sugar residues are added and removed
while the protein moves along the secretory pathway in the host organism. The
enzymes which reside along the glycosylation pathway of the host organism or cell
determine the resulting glycosylation patterns of secreted proteins.
Unfortunately, the resulting glycosylation pattern of proteins expressed in lower
eukaryotic host cells differs substantially from the glycosylation found in higher
eukaryotes such as humans and other mammals (Bretthauer, 1999). Moreover, the
vastly different glycosylation pattern has, in some cases, been shown to increase
the immunogenicity of these proteins in humans and reduce their half-life
(Takeuchi, 1997). It would be desirable to produce human-like glycoproteins in
non-human host cells, especially lower eukaryotic cells.

WO 03/056914    PCT/US02/41510

[0005] The early steps of human glycosylation can be divided into at least two
different phases: (i) lipid-linked Glc3Man9GlcNAc2 oligosaccharides are assembled
by a sequential set of reactions at the membrane of the endoplasmic reticulum (ER)
and (ii) the transfer of this oligosaccharide from the lipid anchor dolichyl
pyrophosphate onto de novo synthesized protein. The site of the specific transfer is
defined by an asparagine (Asn) residue in the sequence Asn-Xaa-Ser/Thr (see Fig.
1), where Xaa can be any amino acid except proline (Gavel, 1990). Further
processing by glucosidases and mannosidases occurs in the ER before the nascent
glycoprotein is transferred to the early Golgi apparatus, where additional mannose
residues are removed by Golgi-specific alpha (α)-1,2-mannosidases. Processing
continues as the protein proceeds through the Golgi. In the medial Golgi, a
number of modifying enzymes, including N-acetylglucosaminyltransferases (GnT
I, GnT II, GnT III, GnT IV, GnT V, GnT VI), mannosidase II and
fucosyltransferases, add and remove specific sugar residues (see, e.g., Figs. 2 and
3). Finally, in the trans-Golgi, galactosyltransferases and sialyltransferases produce
a glycoprotein structure that is released from the Golgi. It is this structure,
characterized by bi-, tri- and tetra-antennary structures, containing galactose,
fucose, N-acetylglucosamine and a high degree of terminal sialic acid, that gives
glycoproteins their human characteristics.
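
The Asn-Xaa-Ser/Thr rule described above lends itself to a simple pattern search. The following is a minimal sketch, not part of the disclosed method, that lists candidate N-glycosylation sites in a one-letter amino acid sequence. The function name `find_sequons` and the toy sequence are illustrative assumptions, and a matching sequon only marks a potential site, since not every sequon is glycosylated in vivo.

```python
import re

# Asn-Xaa-Ser/Thr sequon, where Xaa is any residue except proline.
# The lookahead leaves the following residues unconsumed, so overlapping
# sequons (for example in "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence, not taken from any real protein.
toy_sequence = "MKNVSAANPTLLNGTQ"
print(find_sequons(toy_sequence))
```

In the toy sequence the asparagine followed by proline is skipped, which mirrors the "any amino acid except proline" condition in the text.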

[0006] In nearly all eukaryotes, glycoproteins are derived from the common core
oligosaccharide precursor Glc3Man9GlcNAc2-PP-Dol, where PP-Dol stands for
dolichol-pyrophosphate (Fig. 1). Within the endoplasmic reticulum, synthesis and
processing of dolichol pyrophosphate bound oligosaccharides are identical
between all known eukaryotes. However, further processing of the core
oligosaccharide by yeast, once it has been transferred to a peptide leaving the ER
and entering the Golgi, differs significantly from humans as it moves along the
secretory pathway and involves the addition of several mannose sugars.
[0007] In yeast, these steps are catalyzed by Golgi-residing
mannosyltransferases, like Och1p, Mnt1p and Mnn1p, which sequentially add
mannose sugars to the core oligosaccharide. The resulting structure is undesirable
for the production of humanoid proteins and it is thus desirable to reduce or
eliminate mannosyltransferase activity. Mutants of S. cerevisiae deficient in
mannosyltransferase activity (for example och1 or mnn9 mutants) have been
shown to be non-lethal and display a reduced mannose content in the
oligosaccharide of yeast glycoproteins. Other oligosaccharide processing enzymes,
such as mannosylphosphate transferase, may also have to be eliminated depending
on the host's particular endogenous glycosylation pattern.
Lipid-Linked Oligosaccharide Precursors 

[0008] Of particular interest for this invention are the early steps of
N-glycosylation (Figs. 1 and 2). The study of alg (asparagine-linked glycosylation)
mutants defective in the biosynthesis of the Glc3Man9GlcNAc2-PP-Dol has helped
to elucidate the initial steps of N-glycosylation.

[0009] The ALG3 gene of S. cerevisiae has been successfully cloned and knocked
out by deletion (Aebi, 1996). ALG3 has been shown to encode the enzyme
Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase, which is involved in the first
Dol-P-Man dependent mannosylation step from Man5GlcNAc2-PP-Dol to
Man6GlcNAc2-PP-Dol at the luminal side of the ER (Sharma, 2001) (Figs. 1 and
2). S. cerevisiae cells harboring a leaky alg3-1 mutation accumulate
Man5GlcNAc2-PP-Dol (Structure I) (Huffaker, 1983).

Structure I: Man5GlcNAc2
[Structure diagram not reproduced. Legend: α-1,2-mannose; α-1,6-mannose;
α-1,3-mannose; β-1,4-mannose; β-1,4-GlcNAc; GlcNAc]

Man5GlcNAc2 (Structure I) and Man8GlcNAc2 accumulate in total cell
mannoprotein of an och1 mnn1 alg3 mutant (Nakanishi-Shindo, 1993). This
S. cerevisiae och1, mnn1, alg3 mutant was shown to be viable, but
temperature-sensitive, and to lack α-1,6 polymannose outer chains.

[0010] In another study, secretory proteins expressed in a strain deleted for alg3
(Δalg3 background) were studied for their resistance to
Endo-β-N-acetylglucosaminidase H (Endo H) (Aebi, 1996). Previous observations have
indicated that only those oligosaccharides larger than Man5GlcNAc2 are
susceptible to cleavage by Endo H (Hubbard, 1980). In the alg3-1 phenotype,
some glycoforms were sensitive to Endo H cleavage, confirming its leakiness,
whereas in the Δalg3 mutant all glycoforms appeared to be resistant and of the
Man5-type (Aebi, 1996), suggesting a tight phenotype and transfer of
Man5GlcNAc2 oligosaccharide structures onto the nascent polypeptide chain. No
obvious phenotype was connected with the inactivation of the ALG3 gene (Aebi,
1996). Secreted exoglucanase produced in a Saccharomyces cerevisiae alg3
mutant was found to contain between 35 and 44% underglycosylated and
unglycosylated forms, and only about 50% of the transferred oligosaccharides
remained resistant to Endo H treatment (Cueva, 1996). Exoglucanase (Exg), an
enzyme that contains two potential N-glycosylation sites at Asn165 and Asn325, was
analyzed in more detail. For Exg molecules that received two oligosaccharides it
was shown that the first N-glycosylation site (Asn165) was enriched in truncated
residues, whereas the second (Asn325) was enriched in regular oligosaccharides.
35-44% of secreted exoglucanase was non- or underglycosylated and about 73-78%
of all available N-glycosylation sites were occupied with either truncated or
regular oligosaccharides (Cueva, 1996).
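
The two exoglucanase figures quoted above (the share of non- or underglycosylated molecules and the share of occupied sites) can be related by simple bookkeeping for a protein with two sites. The sketch below is an illustration only, with several assumptions that go beyond the text: that "non- or underglycosylated" means fewer than two glycans per molecule, that the low end of one range pairs with the high end of the other, and that the function name `glycoform_fractions` is hypothetical.

```python
def glycoform_fractions(under_or_unglycosylated: float, site_occupancy: float):
    """Split a two-site glycoprotein pool into 0-, 1- and 2-glycan fractions.

    under_or_unglycosylated: fraction of molecules carrying fewer than two glycans
    site_occupancy: fraction of all sites (two per molecule) carrying a glycan
    """
    two = 1.0 - under_or_unglycosylated
    # occupancy = (one + 2 * two) / 2, so one = 2 * occupancy - 2 * two
    one = 2.0 * site_occupancy - 2.0 * two
    zero = under_or_unglycosylated - one
    if min(zero, one, two) < 0:
        raise ValueError("inputs are inconsistent with a two-site protein")
    return zero, one, two


# Assumed pairing of the quoted ranges: 35% under/unglycosylated with 78% occupancy.
zero, one, two = glycoform_fractions(0.35, 0.78)
print(f"no glycan: {zero:.2f}, one glycan: {one:.2f}, two glycans: {two:.2f}")
```

Under these assumptions most of the non-fully-glycosylated pool would carry one glycan rather than none, but the split depends entirely on how the two ranges are paired.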
Transfer of Glucosylated Lipid-Linked Oligosaccharides

[0011] Evidence suggests that, in mammalian cells, only glucosylated
lipid-linked oligosaccharides are transferred to nascent proteins (Turco, 1977),
while in yeast alg5, alg6 and dpg1 mutants, nonglucosylated oligosaccharides can be
transferred (Ballou, 1986; Runge, 1984). In a Saccharomyces cerevisiae alg8
mutant, underglucosylated GlcMan9GlcNAc2 is transferred (Runge, 1986).
Verostek and co-workers studied an alg3, sec18, gls1 mutant and proposed that
glucosylation of a Man5GlcNAc2 structure (Structure I, above) is relatively slow in
comparison to glucosylation of a lipid-linked Man9 structure. In addition, the
transfer of this Man5GlcNAc2 structure to protein appears to be about 5-fold more
efficient than the glucosylation to Glc3Man5GlcNAc2. The decreased rate of
Man5GlcNAc2 glucosylation in combination with the comparatively faster rate of
Man5 structure transfer onto nascent protein is believed to be the cause of the
observed accumulation of nonglucosylated Man5 structures in alg3 mutant yeast
(Verostek-a, 1993; Verostek-b, 1993).
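
The "about 5-fold" comparison above can be turned into a rough partitioning estimate. The sketch below assumes, purely for illustration, that transfer to protein and glucosylation are competing first-order processes drawing on the same pool of lipid-linked Man5GlcNAc2, and that "5-fold more efficient" can be read as a 5:1 rate ratio. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def fraction_transferred_unglucosylated(relative_transfer_rate: float,
                                        relative_glucosylation_rate: float = 1.0) -> float:
    """Share of a substrate pool leaving via transfer when two routes compete.

    For two competing first-order processes the share taking each route is
    its rate divided by the sum of both rates.
    """
    total = relative_transfer_rate + relative_glucosylation_rate
    if relative_transfer_rate < 0 or relative_glucosylation_rate < 0 or total == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return relative_transfer_rate / total


# Assumed 5:1 ratio of transfer to glucosylation, read from "about 5-fold".
share = fraction_transferred_unglucosylated(5.0)
print(f"{share:.0%} of Man5GlcNAc2 would be transferred without glucose")
```

On this reading roughly five sixths of the Man5 pool would reach protein unglucosylated, which is consistent in direction with the accumulation of nonglucosylated Man5 structures that the paragraph describes.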

[0012] Studies preceding the above work did not reveal any lipid-linked
glucosylated oligosaccharides (Orlean, 1990; Huffaker, 1983), allowing the
conclusion that glucosylated oligosaccharides are transferred at a much higher rate
than their nonglucosylated counterparts and thus are much harder to isolate.
Recent work has allowed the creation and study of yeast strains with un- and
hypoglucosylated oligosaccharides and has further confirmed the importance of the
addition of glucose to the antenna of lipid-linked oligosaccharides for substrate
recognition by the oligosaccharyltransferase complex (Reiss, 1996; Stagljar, 1994;
Burda, 1998). The decreased degree of glucosylation of the lipid-linked
Man5-oligosaccharides in an alg3 mutant negatively impacts the kinetics of the
transfer of lipid-linked oligosaccharides onto nascent protein and is believed to
be the cause for the strong underglycosylation of secreted proteins in an alg3
knock-out strain (Aebi, 1996).

[0013] The assembly of the lipid-linked core oligosaccharide Man9GlcNAc2
occurs, as described above, at the membrane of the endoplasmic reticulum. The
additions of three glucose units to the α-1,3-antenna of the lipid-linked
oligosaccharides are the final reactions in the oligosaccharide assembly. First an
α-1,3 glucose residue is added, followed by another α-1,3 glucose residue and a
terminal α-1,2 glucose residue. Mutants accumulating dolichol-linked
Man9GlcNAc2 have been shown to be defective in the ALG6 locus, and Alg6p has
similarities to Alg8p, the α-1,3-glucosyltransferase catalyzing the addition of the
second α-1,3-linked glucose (Reiss, 1996). Cells with a defective ALG8 locus
accumulate dolichol-linked Glc1Man9GlcNAc2 (Runge, 1986; Stagljar, 1994). The
ALG10 locus encodes the α-1,2 glucosyltransferase responsible for the addition of
a single terminal glucose to Glc2Man9GlcNAc2-PP-Dol (Burda, 1998).
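
The gene-to-step assignments given in this section can be collected into a small lookup. The following is a simplified sketch that predicts which lipid-linked precursor builds up when one step is blocked, under the assumption that a block simply accumulates the substrate of that step. As noted above for alg3, real cells can also glucosylate or transfer truncated precursors, so this is a first approximation. The Man6 to Man9 steps are not detailed in this passage and are collapsed into one unnamed entry; the function name is hypothetical.

```python
# Ordered assembly steps as stated in the text: (gene, substrate, product).
# The Man6 -> Man9 steps are not detailed here, so they appear as a single
# placeholder entry with no gene assigned.
LLO_STEPS = [
    ("ALG3", "Man5GlcNAc2-PP-Dol", "Man6GlcNAc2-PP-Dol"),
    (None, "Man6GlcNAc2-PP-Dol", "Man9GlcNAc2-PP-Dol"),
    ("ALG6", "Man9GlcNAc2-PP-Dol", "Glc1Man9GlcNAc2-PP-Dol"),
    ("ALG8", "Glc1Man9GlcNAc2-PP-Dol", "Glc2Man9GlcNAc2-PP-Dol"),
    ("ALG10", "Glc2Man9GlcNAc2-PP-Dol", "Glc3Man9GlcNAc2-PP-Dol"),
]


def accumulated_precursor(defective_gene: str) -> str:
    """Return the substrate of the step catalyzed by the defective gene."""
    wanted = defective_gene.upper()
    for gene, substrate, _product in LLO_STEPS:
        if gene == wanted:
            return substrate
    raise KeyError(f"no step listed for gene {defective_gene!r}")


for mutant in ("alg3", "alg6", "alg8", "alg10"):
    print(mutant, "->", accumulated_precursor(mutant))
```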
Sequential Processing of N-glycans by Localized Enzyme Activities 
[0014] Sugar transferases and mannosidases line the inner (luminal) surface of
the ER and Golgi apparatus and thereby provide a "catalytic" surface that allows
for the sequential processing of glycoproteins as they proceed through the ER and
Golgi network. In fact, the multiple compartments of the cis, medial, and trans
Golgi and the trans-Golgi Network (TGN) provide the different localities in
which the ordered sequence of glycosylation reactions can take place. As a
glycoprotein proceeds from synthesis in the ER to full maturation in the late Golgi
or TGN, it is sequentially exposed to different glycosidases, mannosidases and
glycosyltransferases such that a specific carbohydrate structure may be synthesized.
Much work has been dedicated to revealing the exact mechanism by which these
enzymes are retained and anchored to their respective organelle. The evolving
picture is complex, but evidence suggests that the stem region, membrane spanning
region and cytoplasmic tail individually or in concert direct enzymes to the
membrane of individual organelles and thereby localize the associated catalytic
domain to that locus.

[0015] In some cases these specific interactions were found to function across
species. For example, the membrane spanning domain of α2,6-ST from rats, an
enzyme known to localize in the trans-Golgi of the animal, was shown to also
localize a reporter gene (invertase) in the yeast Golgi (Schwientek, 1995).
However, the very same membrane spanning domain as part of a full-length α2,6-ST
was retained in the ER and not further transported to the Golgi of yeast
(Krezdorn, 1994). A full-length GalT from humans was not even synthesized in
yeast, despite demonstrably high transcription levels. On the other hand, the
transmembrane region of the same human GalT fused to an invertase reporter was
able to direct localization to the yeast Golgi, albeit at low production levels.
Schwientek and co-workers have shown that fusing 28 amino acids of a yeast
mannosyltransferase (Mnt1), a region containing a cytoplasmic tail, a
transmembrane region and eight amino acids of the stem region, to the catalytic
domain of human GalT is sufficient for Golgi localization of an active GalT.
Other galactosyltransferases appear to rely on interactions with enzymes resident
in particular organelles, since after removal of their transmembrane region they are
still able to localize properly. To date there exists no reliable way of predicting
whether a particular heterologously expressed glycosyltransferase or mannosidase
in a lower eukaryote will be (1) sufficiently translated, (2) catalytically active or
(3) located to the proper organelle within the secretory pathway. Since all three of
these are necessary to effect glycosylation patterns in lower eukaryotes, a
systematic scheme to achieve the desired catalytic function and proper retention of
enzymes in the absence of predictive tools, which are currently not available, has
been designed.

Production of Therapeutic Glycoproteins 

[0016] A significant number of proteins isolated from humans or animals are
post-translationally modified, with glycosylation being one of the most significant
modifications. An estimated 70% of all therapeutic proteins are glycosylated and
thus currently rely on a production system (i.e., host cell) that is able to glycosylate
in a manner similar to humans. To date, most glycoproteins are made in a
mammalian host system. Several studies have shown that glycosylation plays an
important role in determining the (1) immunogenicity, (2) pharmacokinetic
properties, (3) trafficking, and (4) efficacy of therapeutic proteins. It is thus not
surprising that substantial efforts by the pharmaceutical industry have been
directed at developing processes to obtain glycoproteins that are as "humanoid" or
"human-like" as possible. This may involve the genetic engineering of such
mammalian cells to enhance the degree of sialylation (i.e., terminal addition of
sialic acid) of proteins expressed by the cells, which is known to improve
pharmacokinetic properties of such proteins. Alternatively one may improve the
degree of sialylation by in vitro addition of such sugars using known
glycosyltransferases and their respective nucleotide sugars (e.g., 2,3
sialyltransferase and CMP-Sialic acid).

[0017] Future research may reveal the biological and therapeutic significance of
specific glycoforms, thereby rendering the ability to produce such specific
glycoforms desirable. To date, efforts have concentrated on making proteins with
fairly well characterized glycosylation patterns, and expressing a cDNA encoding
such a protein in one of the following higher eukaryotic protein expression
systems:

1. Higher eukaryotes such as Chinese hamster ovary cells (CHO),
mouse fibroblast cells and mouse myeloma cells (Werner, 1998);

2. Transgenic animals such as goats, sheep, mice and others (Dente,
1988); (Cole, 1994); (McGarvey, 1995); (Bardor, 1999);

3. Plants (Arabidopsis thaliana, tobacco etc.) (Staub, 2000);
(McGarvey, 1995); (Bardor, 1999);

4. Insect cells (Spodoptera frugiperda Sf9, Sf21, Trichoplusia ni, etc.)
in combination with recombinant baculoviruses such as Autographa californica
multiple nuclear polyhedrosis virus, which infects lepidopteran cells (Altmann,
1999).

[0018] While most higher eukaryotes carry out glycosylation reactions that are
similar to those found in humans, recombinant human proteins expressed in the
above mentioned host systems invariably differ from their "natural" human
counterpart (Raju, 2000). Extensive development work has thus been directed at
finding ways to improve the "human character" of proteins made in these
expression systems. This includes the optimization of fermentation conditions and
the genetic modification of protein expression hosts by introducing genes encoding
enzymes involved in the formation of human-like glycoforms (Werner, 1998);
(Weikert, 1999); (Andersen, 1994); (Yang, 2000). Inherent problems associated
with all mammalian expression systems have not been solved.

[0019] Fermentation processes based on mammalian cell culture (e.g., CHO,
murine, or human cells), for example, tend to be very slow (fermentation times in
excess of one week are not uncommon), often yield low product titers, require
expensive nutrients and cofactors (e.g., bovine fetal serum), are limited by
programmed cell death (apoptosis), and often do not enable expression of
particular therapeutically valuable proteins. More importantly, mammalian cells
are susceptible to viruses that have the potential to be human pathogens and
stringent quality controls are required to assure product safety. This is of particular
concern since many such processes require the addition of complex and
temperature sensitive media components that are derived from animals (e.g.,
bovine calf serum), which may carry agents pathogenic to humans such as bovine
spongiform encephalopathy (BSE) prions or viruses. Moreover, the production of
therapeutic compounds is preferably carried out in a well-controlled sterile
environment. An animal farm, no matter how cleanly kept, does not constitute
such an environment, thus constituting an additional problem in the use of
transgenic animals for manufacturing high volume therapeutic proteins.
[0020] Most, if not all, currently produced therapeutic glycoproteins are therefore
expressed in mammalian cells and much effort has been directed at improving (i.e.,
"humanizing") the glycosylation pattern of these recombinant proteins. Changes in
medium composition as well as the co-expression of genes encoding enzymes
involved in human glycosylation have been successfully employed (see, for
example, Weikert, 1999).

[0021] While recombinant proteins similar to their human counterparts can be
made in mammalian expression systems, it is currently not possible to make
proteins with a human-like glycosylation pattern in lower eukaryotes (fungi and
yeast). Although the core oligosaccharide structure transferred to a protein in the
endoplasmic reticulum is basically identical in mammals and lower eukaryotes,
substantial differences have been found in the subsequent processing reactions
which occur in the Golgi apparatus of fungi and mammals. In fact, even
amongst different lower eukaryotes there exists a great variety of glycosylation
structures. This has prevented the use of lower eukaryotes as hosts for the
production of recombinant human glycoproteins despite otherwise notable
advantages over mammalian expression systems, such as: (1) generally higher
product titers, (2) shorter fermentation times, (3) having an alternative for proteins
that are poorly expressed in mammalian cells, (4) the ability to grow in a
chemically defined protein-free medium and thus not requiring complex
animal-derived media components, and (5) the absence of viral, especially
retroviral, infections of such hosts.

[0022] Various methylotrophic yeasts such as Pichia pastoris, Pichia
methanolica, and Hansenula polymorpha have played particularly important roles
as eukaryotic expression systems because they are able to grow to high cell
densities and secrete large quantities of recombinant protein. However, as noted
above, lower eukaryotes such as yeast do not glycosylate proteins like higher
mammals. See for example, Martinet et al. (1998) Biotechnol. Lett. Vol. 20, No. 12,
which discloses the expression of a heterologous mannosidase in the endoplasmic
reticulum (ER).

[0023] Chiba et al. (1998) have shown that S. cerevisiae can be engineered to
provide structures ranging from Man8GlcNAc2 to Man5GlcNAc2 structures, by
eliminating 1,6 mannosyltransferase (OCH1), 1,3 mannosyltransferase (MNN1)
and a regulator of mannosylphosphatetransferase (MNN4) and by targeting the
catalytic domain of α-1,2-mannosidase I from Aspergillus saitoi into the ER of
S. cerevisiae using an ER retrieval sequence (Chiba, 1998). However, this attempt
resulted in little or no production of the desired Man5GlcNAc2, e.g., one that was
made in vivo and which could function as a substrate for GnT I (the next step in
making human-like glycan structures). Chiba et al. (1998) showed that P. pastoris
is not inherently able to produce useful quantities (greater than 5%) of
GlcNAcTransferase I accepting carbohydrate.

[0024] Maras and co-workers assert that in T. reesei "sufficient concentrations of
acceptor substrate (i.e. Man5GlcNAc2) are present"; however, when trying to
convert this acceptor substrate to GlcNAcMan5GlcNAc2 in vitro, less than 2% was
converted, thereby demonstrating the presence of Man5GlcNAc2 structures that are
not suitable precursors for complex N-glycan formation (Maras, 1997; Maras,
1999). To date no enabling disclosure exists that allows for the production of
commercially relevant quantities of GlcNAcMan5GlcNAc2 in lower eukaryotes.

[0025] It is therefore an object of the present invention to provide a system and
methods for humanizing glycosylation of recombinant glycoproteins expressed in
non-human host cells.

SUMMARY OF THE INVENTION

[0026] The present invention relates to host cells such as fungal strains having
modified lipid-linked oligosaccharides which may be modified further by
heterologous expression of a set of glycosyltransferases, sugar transporters and
mannosidases to become host-strains for the production of mammalian, e.g.,
human therapeutic glycoproteins. A protein production method has been
developed using (1) a lower eukaryotic host such as a unicellular or filamentous
fungus, or (2) any non-human eukaryotic organism that has a different
glycosylation pattern from humans, to modify the glycosylation composition and
structures of the proteins made in a host organism ("host cell") so that they
resemble more closely carbohydrate structures found in human proteins. The
process allows one to obtain an engineered host cell which can be used to express
and target any desirable gene(s) involved in glycosylation by methods that are well
established in the scientific literature and generally known to the artisan in the field
of protein expression. As described herein, host cells with modified lipid-linked
oligosaccharides are created or selected. N-glycans made in the engineered host
cells have a GlcNAcMan3GlcNAc2 core structure which may then be modified
further by heterologous expression of one or more enzymes, e.g.,
glycosyltransferases, sugar transporters and mannosidases, to yield human-like
glycoproteins. For the production of therapeutic proteins, this method may be
adapted to engineer cell lines in which any desired glycosylation structure may be
obtained.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] Figure 1 is a schematic of the structure of the dolichyl
pyrophosphate-linked oligosaccharide.
[0028] Figure 2 is a schematic of the generation of GlcNAc2Man3GlcNAc2
N-glycans from fungal host cells which are deficient in alg3, alg9 or alg12 activities.
[0029] Figure 3 is a schematic of processing reactions required to produce
mammalian-type oligosaccharide structures in a fungal host cell with an alg3, och1
genotype.

[0030] Figure 4 shows S. cerevisiae Alg3 Sequence Comparisons (Blast)
[0031] Figure 5 shows S. cerevisiae Alg3 and Alg3p Sequences
[0032] Figure 6 shows P. pastoris Alg3 and Alg3p Sequences
[0033] Figure 7 shows P. pastoris Alg3 Sequence Comparisons (Blast)
[0034] Figure 8 shows K. lactis Alg3 and Alg3p Sequences
[0035] Figure 9 shows K. lactis Alg3 Sequence Comparisons (Blast)
[0036] Figure 10 shows S. cerevisiae Alg9 and Alg9p Sequences
[0037] Figure 11 shows P. pastoris Alg9 and Alg9p Sequences
[0038] Figure 12 shows P. pastoris Alg9 Sequence Comparisons (Blast)
[0039] Figure 13 shows S. cerevisiae Alg12 and Alg12p Sequences
[0040] Figure 14 shows P. pastoris Alg12 and Alg12p Sequences
[0041] Figure 15 shows P. pastoris Alg12 Sequence Comparisons (Blast)
[0042] Figure 16 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris showing that the predominant
N-glycan is GlcNAcMan5GlcNAc2.

[0043] Figure 17 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris (Fig. 16) treated with
β-N-hexosaminidase (peak corresponding to Man5GlcNAc2) to confirm that the
predominant N-glycan of Fig. 16 is GlcNAcMan5GlcNAc2.

[0044] Figure 18 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris alg3 deletion mutant showing that
the predominant N-glycans are GlcNAcMan3GlcNAc2 and GlcNAcMan4GlcNAc2.

[0045] Figure 19 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris alg3 deletion mutant treated with
α1,2 mannosidase, showing that the GlcNAcMan4GlcNAc2 of Fig. 18 is converted
to GlcNAcMan3GlcNAc2.

[0046] Figure 20 is a MALDI-TOF-MS analysis of N-glycans of Fig. 19 treated
with β-N-hexosaminidase (peak corresponding to Man3GlcNAc2) to confirm that
the N-glycan of Fig. 19 is GlcNAcMan3GlcNAc2.

[0047] Figure 21 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris alg3 deletion mutant treated with
α1,2 mannosidase and GnT II, showing that the GlcNAcMan3GlcNAc2 of Fig. 19 is
converted to GlcNAc2Man3GlcNAc2.

[0048] Figure 22 is a MALDI-TOF-MS analysis of N-glycans of Fig. 21 treated
with β-N-hexosaminidase (peak corresponding to Man3GlcNAc2) to confirm that
the N-glycan of Fig. 21 is GlcNAc2Man3GlcNAc2.

[0049] Figure 23 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris alg3 deletion mutant treated with
α1,2 mannosidase and GnT II in the presence of UDP-galactose and
β1,4-galactosyltransferase, showing that the GlcNAc2Man3GlcNAc2 of Fig. 21 is
converted to Gal2GlcNAc2Man3GlcNAc2.

[0050] Figure 24 is a MALDI-TOF-MS analysis of N-glycans isolated from a
kringle 3 glycoprotein produced in a P. pastoris alg3 deletion mutant treated with
α1,2 mannosidase and GnT II in the presence of UDP-galactose and
β1,4-galactosyltransferase, and further treated with CMP-N-acetylneuraminic acid and
sialyltransferase, showing that the Gal2GlcNAc2Man3GlcNAc2 is converted to
NANA2Gal2GlcNAc2Man3GlcNAc2.
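
When reading mass spectra such as those of Figs. 16 to 24, it helps to know what mass each named glycan should have. The following is a minimal sketch that parses the glycan names used in the figure descriptions and computes a monoisotopic mass from standard residue masses. It assumes underivatized glycans detected as singly charged sodium adducts, which is common in positive-mode MALDI but is not stated in the text. The function names `composition` and `sodiated_mass` are hypothetical.

```python
import re

# Monoisotopic residue masses in Da (monosaccharide minus one water).
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "NeuAc": 291.0954}
WATER = 18.0106        # one water for the free reducing-end glycan
SODIUM_ION = 22.9892   # assumed adduct; the figure descriptions do not name one

# Abbreviations from the figure descriptions mapped to residue classes.
RESIDUE_CLASS = {"Man": "Hex", "Gal": "Hex", "Glc": "Hex",
                 "GlcNAc": "HexNAc", "NANA": "NeuAc"}

# "GlcNAc" is listed before "Glc" so the longer abbreviation wins.
TOKEN = r"(GlcNAc|NANA|Gal|Glc|Man)(\d*)"


def composition(name: str) -> dict[str, int]:
    """Count residues by class in a name such as 'GlcNAc2Man3GlcNAc2'."""
    if not re.fullmatch(f"(?:{TOKEN})+", name):
        raise ValueError(f"unrecognized glycan name: {name!r}")
    counts: dict[str, int] = {}
    for abbreviation, number in re.findall(TOKEN, name):
        residue_class = RESIDUE_CLASS[abbreviation]
        # A missing number means a single residue, as in "GlcNAcMan5GlcNAc2".
        counts[residue_class] = counts.get(residue_class, 0) + int(number or 1)
    return counts


def sodiated_mass(name: str) -> float:
    """Monoisotopic mass of the glycan plus one sodium ion."""
    residues = sum(RESIDUE_MASS[c] * n for c, n in composition(name).items())
    return residues + WATER + SODIUM_ION


for glycan in ("Man5GlcNAc2", "GlcNAcMan5GlcNAc2", "GlcNAcMan3GlcNAc2",
               "GlcNAc2Man3GlcNAc2", "Gal2GlcNAc2Man3GlcNAc2"):
    print(f"{glycan}: {sodiated_mass(glycan):.2f}")
```

Under these assumptions, removing one terminal GlcNAc with hexosaminidase lowers the mass by one HexNAc residue (about 203.08 Da), which is the kind of shift the confirmation experiments of Figs. 17, 20 and 22 rely on.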

[0051] Figure 25 shows S. cerevisiae Alg6 and Alg6p Sequences
[0052] Figure 26 shows P. pastoris Alg6 and Alg6p Sequences
[0053] Figure 27 shows P. pastoris Alg6 Sequence Comparisons (Blast)
[0054] Figure 28 shows K. lactis Alg6 and Alg6p Sequences
[0055] Figure 29 shows K. lactis Alg6 Sequence Comparisons (Blast)
[0056] Figure 30 shows a model of an IgG immunoglobulin. Heavy chain and light
chain can be, based on similar secondary and tertiary structure, subdivided into
domains. The two heavy chains (domains VH, CH1, CH2 and CH3) are linked
through three disulfide bridges. The light chains (domains VL and CL) are linked by
another disulfide bridge to the CH1 portion of the heavy chain and, together with
the CH1 and VH fragments, make up the Fab region. Antigens bind to the terminal
portion of the Fab region. Effector functions, such as Fc-gamma-receptor binding,
have been localized to the CH2 domain, just downstream of the hinge region, and
are influenced by N-glycosylation of asparagine 297 in the heavy chain.
[0057] Figure 31 is a schematic overview of a modular IgG1 expression vector.
[0058] Figure 32 shows M. musculus GnT III Nucleic Acid And Amino Acid
Sequences

[0059] Figure 33 shows H. sapiens GnT IV Nucleic Acid And Amino Acid
Sequences

[0060] Figure 34 shows M. musculus GnT V Nucleic Acid And Amino Acid
Sequences

DETAILED DESCRIPTION OF THE INVENTION 

[0061] Unless otherwise defined herein, scientific and technical terms used in
connection with the present invention shall have the meanings that are commonly
understood by those of ordinary skill in the art. Further, unless otherwise required
by context, singular terms shall include pluralities and plural terms shall include
the singular. Generally, nomenclatures used in connection with, and techniques
of, biochemistry, enzymology, molecular and cellular biology, microbiology,
genetics and protein and nucleic acid chemistry and hybridization described herein
are those well known and commonly used in the art. The methods and techniques
of the present invention are generally performed according to conventional methods
well known in the art and as described in various general and more specific
references that are cited and discussed throughout the present specification unless
otherwise indicated.
See, e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual, 2d ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989); Ausubel et al.,
Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and
Supplements to 2002); Harlow and Lane Antibodies: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1990); Introduction to
Glycobiology, Maureen E. Taylor, Kurt Drickamer, Oxford Univ. Press (2003);
Worthington Enzyme Manual, Worthington Biochemical Corp. Freehold, NJ;
Handbook of Biochemistry: Section A Proteins Vol I 1976 CRC Press; Handbook
of Biochemistry: Section A Proteins Vol II 1976 CRC Press; Essentials of
Glycobiology, Cold Spring Harbor Laboratory Press (1999). The nomenclatures
used in connection with, and the laboratory procedures and techniques of,
biochemistry and molecular biology described herein are those well known and
commonly used in the art.

[0062] All publications, patents and other references mentioned herein are
incorporated by reference.

[0063] The following terms, unless otherwise indicated, shall be understood to
have the following meanings:

[0064] As used herein, the term "N-glycan" refers to an N-linked
oligosaccharide, e.g., one that is attached by an asparagine-N-acetylglucosamine
linkage to an asparagine residue of a polypeptide. N-glycans have a common
pentasaccharide core of Man3GlcNAc2 ("Man" refers to mannose; "Glc" refers to
glucose; and "NAc" refers to N-acetyl; GlcNAc refers to N-acetylglucosamine).
N-glycans differ with respect to the number of branches (antennae) comprising
peripheral sugars (e.g., fucose and sialic acid) that are added to the Man3GlcNAc2
("Man3") core structure. N-glycans are classified according to their branched
constituents (e.g., high mannose, complex or hybrid). A "high mannose" type
N-glycan has five or more mannose residues. A "complex" type N-glycan typically
has at least one GlcNAc attached to the 1,3 mannose arm and at least one GlcNAc
attached to the 1,6 mannose arm of a "trimannose" core. The "trimannose core" is
the pentasaccharide core having a Man3 structure. Complex N-glycans may also
have galactose ("Gal") residues that are optionally modified with sialic acid or
derivatives ("NeuAc", where "Neu" refers to neuraminic acid and "Ac" refers to
acetyl). Complex N-glycans may also have intrachain substitutions comprising
"bisecting" GlcNAc and core fucose ("Fuc"). A "hybrid" N-glycan has at least
one GlcNAc on the terminal of the 1,3 mannose arm of the trimannose core and
zero or more mannoses on the 1,6 mannose arm of the trimannose core.

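By way of illustration only, the three N-glycan classes defined above can be
expressed as a small decision rule. The following Python sketch is not part of
the specification: the NGlycan record, its field names, and the order in which
the rules are tried (complex, then hybrid, then high mannose) are assumptions
made here for clarity, and real glycan structures carry far more detail.

```python
from dataclasses import dataclass


@dataclass
class NGlycan:
    """Hypothetical summary of an N-glycan built on the Man3GlcNAc2 core."""
    mannose_total: int      # all mannose residues, including the core mannoses
    glcnac_on_1_3_arm: int  # GlcNAc residues attached to the 1,3 mannose arm
    glcnac_on_1_6_arm: int  # GlcNAc residues attached to the 1,6 mannose arm


def classify(glycan: NGlycan) -> str:
    """Classify a glycan; the rule order is an assumption, not from the text."""
    # Complex: at least one GlcNAc on each of the 1,3 and 1,6 mannose arms.
    if glycan.glcnac_on_1_3_arm >= 1 and glycan.glcnac_on_1_6_arm >= 1:
        return "complex"
    # Hybrid: GlcNAc on the 1,3 arm only; the 1,6 arm may carry mannoses.
    if glycan.glcnac_on_1_3_arm >= 1:
        return "hybrid"
    # High mannose: five or more mannose residues and no arm GlcNAc.
    if glycan.mannose_total >= 5:
        return "high mannose"
    return "unclassified"


if __name__ == "__main__":
    for example in (NGlycan(3, 1, 1), NGlycan(5, 1, 0), NGlycan(9, 0, 0)):
        print(example, "->", classify(example))
```

Because a hybrid glycan may itself carry five or more mannoses, the sketch
tests for arm GlcNAc before counting mannose; a different rule order would
classify such a structure differently.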
[0065] Abbreviations used herein are of common usage in the art, see, e.g.,
abbreviations of sugars, above. Other common abbreviations include "PNGase",
which refers to peptide N-glycosidase F (EC 3.2.2.18); "GlcNAc Tr (I - III)",
which refers to one of three N-acetylglucosaminyltransferase enzymes; "NANA"
refers to N-acetylneuraminic acid.

[0066] As used herein, the term "secretion pathway" refers to the assembly line
of various glycosylation enzymes to which a lipid-linked oligosaccharide precursor
and an N-glycan substrate are sequentially exposed, following the molecular flow
of a nascent polypeptide chain from the cytoplasm to the endoplasmic reticulum
(ER) and the compartments of the Golgi apparatus. Enzymes are said to be
localized along this pathway. An enzyme X that acts on a lipid-linked glycan or an
N-glycan before enzyme Y is said to be or to act "upstream" to enzyme Y;
similarly, enzyme Y is or acts "downstream" from enzyme X.

[0067] As used herein, the term "alg X activity" refers to the enzymatic activity
encoded by the "alg X" gene, and to an enzyme having that enzymatic activity
encoded by a homologous gene or gene product (see below) or by an unrelated
gene or gene product.

[0068] As used herein, the term "antibody" refers to a full antibody (consisting
of two heavy chains and two light chains) or a fragment thereof. Such fragments
include, but are not limited to, those produced by digestion with various proteases,
those produced by chemical cleavage and/or chemical dissociation, and those
produced recombinantly, so long as the fragment remains capable of specific
binding to an antigen. Among these fragments are Fab, Fab', F(ab')2, and single
chain Fv (scFv) fragments. Within the scope of the term "antibody" are also
antibodies that have been modified in sequence, but remain capable of specific
binding to an antigen. Examples of modified antibodies are interspecies chimeric
and humanized antibodies; antibody fusions; and heteromeric antibody complexes,
such as diabodies (bispecific antibodies), single-chain diabodies, and intrabodies
(see, e.g., Marasco (ed.), Intracellular Antibodies: Research and Disease
Applications, Springer-Verlag New York, Inc. (1998) (ISBN: 3540641513), the
disclosure of which is incorporated herein by reference in its entirety).

[0069] As used herein, the term "mutation" refers to any change in the nucleic
acid or amino acid sequence of a gene product, e.g., of a glycosylation-related
enzyme.

[0070] The term "polynucleotide" or "nucleic acid molecule" refers to a
polymeric form of nucleotides of at least 10 bases in length. The term includes
DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules
(e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing
non-natural nucleotide analogs, non-native internucleoside bonds, or both. The
nucleic acid can be in any topological conformation. For instance, the nucleic acid
can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially
double-stranded, branched, hairpinned, circular, or in a padlocked conformation.
The term includes single and double stranded forms of DNA.

[0071] Unless otherwise indicated, a "nucleic acid comprising SEQ ID NO:X"
refers to a nucleic acid, at least a portion of which has either (i) the sequence of
SEQ ID NO:X, or (ii) a sequence complementary to SEQ ID NO:X. The choice
between the two is dictated by the context. For instance, if the nucleic acid is used
as a probe, the choice between the two is dictated by the requirement that the probe
be complementary to the desired target.
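By way of illustration only, the two alternatives (i) and (ii) can be checked
with a short Python sketch. The function names are hypothetical, and the sketch
assumes that both sequences are DNA written 5' to 3', so that "complementary"
is implemented as the reverse complement; neither assumption comes from the
specification.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def comprises(nucleic_acid: str, seq_id: str) -> bool:
    """True if a portion of nucleic_acid has (i) the sequence of seq_id
    or (ii) a sequence complementary to seq_id."""
    na = nucleic_acid.upper()
    target = seq_id.upper()
    return target in na or reverse_complement(target) in na


if __name__ == "__main__":
    print(comprises("GGATGCCC", "ATGC"))  # alternative (i): direct match
    print(comprises("TTGCATAA", "ATGC"))  # alternative (ii): complement
```

A probe designed against a target strand would satisfy alternative (ii)
relative to that target, which is the case the paragraph above singles out.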

[0072] An "isolated" or "substantially pure" nucleic acid or polynucleotide (e.g.,
an RNA, DNA or a mixed polymer) is one which is substantially separated from
other cellular components that naturally accompany the native polynucleotide in its
natural host cell, e.g., ribosomes, polymerases, and genomic sequences with which
it is naturally associated. The term embraces a nucleic acid or polynucleotide that
(1) has been removed from its naturally occurring environment, (2) is not
associated with all or a portion of a polynucleotide in which the "isolated
polynucleotide" is found in nature, (3) is operatively linked to a polynucleotide
which it is not linked to in nature, or (4) does not occur in nature. The term
"isolated" or "substantially pure" also can be used in reference to recombinant or
cloned DNA isolates, chemically synthesized polynucleotide analogs, or
polynucleotide analogs that are biologically synthesized by heterologous systems.

[0073] However, "isolated" does not necessarily require that the nucleic acid or
polynucleotide so described has itself been physically removed from its native
environment. For instance, an endogenous nucleic acid sequence in the genome of
an organism is deemed "isolated" herein if a heterologous sequence (i.e., a
sequence that is not naturally adjacent to this endogenous nucleic acid sequence) is
placed adjacent to the endogenous nucleic acid sequence, such that the expression
of this endogenous nucleic acid sequence is altered. By way of example, a
non-native promoter sequence can be substituted (e.g., by homologous
recombination) for the native promoter of a gene in the genome of a human cell,
such that this gene has an altered expression pattern. This gene would now become
"isolated" because it is separated from at least some of the sequences that
naturally flank it.

[0074] A nucleic acid is also considered "isolated" if it contains any
modifications that do not naturally occur to the corresponding nucleic acid in a
genome. For instance, an endogenous coding sequence is considered "isolated" if
it contains an insertion, deletion or a point mutation introduced artificially, e.g., by
human intervention. An "isolated nucleic acid" also includes a nucleic acid
integrated into a host cell chromosome at a heterologous site, and a nucleic acid
construct present as an episome. Moreover, an "isolated nucleic acid" can be
substantially free of other cellular material, or substantially free of culture medium
when produced by recombinant techniques, or substantially free of chemical
precursors or other chemicals when chemically synthesized.

[0075] As used herein, the phrase "degenerate variant" of a reference nucleic
acid sequence encompasses nucleic acid sequences that can be translated,
according to the standard genetic code, to provide an amino acid sequence identical
to that translated from the reference nucleic acid sequence.
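By way of illustration only, the definition of a degenerate variant can be
tested computationally by translating both sequences with the standard genetic
code and comparing the results. In the Python sketch below, the function names
are hypothetical, and the sketch assumes that both inputs are in-frame coding
sequences whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if both sequences encode the identical amino acid sequence."""
    return translate(reference) == translate(candidate)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so the two sequences are degenerate.
    print(is_degenerate_variant("ATGGCT", "ATGGCC"))
    # GAT encodes aspartic acid, so the encoded protein differs.
    print(is_degenerate_variant("ATGGCT", "ATGGAT"))
```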

[0076] The term "percent sequence identity" or "identical" in the context of
nucleic acid sequences refers to the residues in the two sequences which are the
same when aligned for maximum correspondence. The length of sequence identity
comparison may be over a stretch of at least about nine nucleotides, usually at least
about 20 nucleotides, more usually at least about 24 nucleotides, typically at least
about 28 nucleotides, more typically at least about 32 nucleotides, and preferably
at least about 36 or more nucleotides. There are a number of different algorithms
known in the art which can be used to measure nucleotide sequence identity. For
instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit,
which are programs in Wisconsin Package Version 10.0, Genetics Computer
Group (GCG), Madison, Wisconsin. FASTA provides alignments and percent
sequence identity of the regions of the best overlap between the query and search
sequences (Pearson, 1990, herein incorporated by reference). For instance,
percent sequence identity between nucleic acid sequences can be determined using
FASTA with its default parameters (a word size of 6 and the NOPAM factor for
the scoring matrix) or using Gap with its default parameters as provided in GCG
Version 6.1, herein incorporated by reference.
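By way of illustration only, once two sequences have been aligned, percent
sequence identity reduces to counting matching columns. The Python sketch below
is not the FASTA, Gap or Bestfit algorithm and performs no alignment itself; it
assumes the caller supplies two pre-aligned strings of equal length with "-"
marking gaps, and it assumes a denominator equal to the number of aligned
columns, a convention that differs between programs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))  # 7 of 8 columns match
    print(percent_identity("ACG-T", "ACGTT"))        # 4 of 5 columns match
```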

[0077] The term "substantial homology" or "substantial similarity," when
referring to a nucleic acid or fragment thereof, indicates that, when optimally
aligned with appropriate nucleotide insertions or deletions with another nucleic
acid (or its complementary strand), there is nucleotide sequence identity in at least
about 50%, more preferably 60% of the nucleotide bases, usually at least about
70%, more usually at least about 80%, preferably at least about 90%, and more
preferably at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as
measured by any well-known algorithm of sequence identity, such as FASTA,
BLAST or Gap, as discussed above.

[0078] Alternatively, substantial homology or similarity exists when a nucleic
acid or fragment thereof hybridizes to another nucleic acid, to a strand of another
nucleic acid, or to the complementary strand thereof, under stringent hybridization
conditions. "Stringent hybridization conditions" and "stringent wash conditions"
in the context of nucleic acid hybridization experiments depend upon a number of
different physical parameters. Nucleic acid hybridization will be affected by such
conditions as salt concentration, temperature, solvents, the base composition of the
hybridizing species, length of the complementary regions, and the number of
nucleotide base mismatches between the hybridizing nucleic acids, as will be
readily appreciated by those skilled in the art. One having ordinary skill in the art
knows how to vary these parameters to achieve a particular stringency of
hybridization.

[0079] In general, "stringent hybridization" is performed at about 25°C below the
thermal melting point (Tm) for the specific DNA hybrid under a particular set of
conditions. "Stringent washing" is performed at temperatures about 5°C lower
than the Tm for the specific DNA hybrid under a particular set of conditions. The
Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly
matched probe. See Sambrook et al., supra, page 9.51, hereby incorporated by
reference. For purposes herein, "high stringency conditions" are defined for
solution phase hybridization as aqueous hybridization (i.e., free of formamide) in
6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS
at 65°C for 8-12 hours, followed by two washes in 0.2X SSC, 0.1% SDS at 65°C
for 20 minutes. It will be appreciated by the skilled worker that hybridization at
65°C will occur at different rates depending on a number of factors including the
length and percent identity of the sequences which are hybridizing.

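By way of illustration only, the temperature offsets and SSC dilutions stated
above amount to simple arithmetic, sketched below in Python. The function names
and dictionary keys are hypothetical, and the Tm value is supplied by the
caller rather than predicted, since the specification does not give a formula
for it.

```python
# Composition of the 20X SSC stock as stated in the text.
SSC_20X = {"NaCl_M": 3.0, "sodium_citrate_M": 0.3}


def stringent_temperatures(tm_celsius: float) -> dict:
    """Hybridization at about Tm - 25 C and washing at about Tm - 5 C."""
    return {
        "hybridization_c": tm_celsius - 25.0,
        "wash_c": tm_celsius - 5.0,
    }


def ssc_molarity(strength_x: float) -> dict:
    """Molar composition of SSC diluted to the given strength (e.g. 6 for 6X)."""
    factor = strength_x / 20.0
    return {name: conc * factor for name, conc in SSC_20X.items()}


if __name__ == "__main__":
    print(stringent_temperatures(90.0))  # a hypothetical hybrid with Tm = 90 C
    print(ssc_molarity(6.0))             # hybridization buffer strength
    print(ssc_molarity(0.2))             # wash buffer strength
```

On this arithmetic, 6X SSC corresponds to about 0.9 M NaCl and 0.09 M sodium
citrate, and 0.2X SSC to about 0.03 M NaCl and 0.003 M sodium citrate.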
[0080] The nucleic acids (also referred to as polynucleotides) of this invention
may include both sense and antisense strands of RNA, cDNA, genomic DNA, and
synthetic forms and mixed polymers of the above. They may be modified
chemically or biochemically or may contain non-natural or derivatized nucleotide
bases, as will be readily appreciated by those of skill in the art. Such modifications
include, for example, labels, methylation, substitution of one or more of the
naturally occurring nucleotides with an analog, internucleotide modifications such
as uncharged linkages (e.g., methyl phosphonates, phosphotriesters,
phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g.,
acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha
anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic
polynucleotides in their ability to bind to a designated sequence via hydrogen
bonding and other chemical interactions. Such molecules are known in the art and
include, for example, those in which peptide linkages substitute for phosphate
linkages in the backbone of the molecule.

[0081] The term "mutated" when applied to nucleic acid sequences means that
nucleotides in a nucleic acid sequence may be inserted, deleted or changed
compared to a reference nucleic acid sequence. A single alteration may be made at
a locus (a point mutation) or multiple nucleotides may be inserted, deleted or
changed at a single locus. In addition, one or more alterations may be made at any
number of loci within a nucleic acid sequence. A nucleic acid sequence may be
mutated by any method known in the art including but not limited to mutagenesis
techniques such as "error-prone PCR" (a process for performing PCR under
conditions where the copying fidelity of the DNA polymerase is low, such that a
high rate of point mutations is obtained along the entire length of the PCR product.
See, e.g., Leung, D. W., et al., Technique, 1, pp. 11-15 (1989) and Caldwell, R. C.
& Joyce G. F., PCR Methods Applic., 2, pp. 28-33 (1992)); and "oligonucleotide-
directed mutagenesis" (a process which enables the generation of site-specific
mutations in any cloned DNA segment of interest. See, e.g., Reidhaar-Olson, J. F.
& Sauer, R. T., et al., Science, 241, pp. 53-57 (1988)).

[0082] The term "vector" as used herein is intended to refer to a nucleic acid
molecule capable of transporting another nucleic acid to which it has been linked.
One type of vector is a "plasmid", which refers to a circular double stranded DNA
loop into which additional DNA segments may be ligated. Other vectors include
cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes
(YAC). Another type of vector is a viral vector, wherein additional DNA segments
may be ligated into the viral genome (discussed in more detail below). Certain
vectors are capable of autonomous replication in a host cell into which they are
introduced (e.g., vectors having an origin of replication which functions in the host
cell). Other vectors can be integrated into the genome of a host cell upon
introduction into the host cell, and are thereby replicated along with the host
genome. Moreover, certain preferred vectors are capable of directing the
expression of genes to which they are operatively linked. Such vectors are referred
to herein as "recombinant expression vectors" (or simply, "expression vectors").

[0083] "Operatively linked" expression control sequences refers to a linkage in
which the expression control sequence is contiguous with the gene of interest to
control the gene of interest, as well as expression control sequences that act in
trans or at a distance to control the gene of interest.

[0084] The term "expression control sequence" as used herein refers to
polynucleotide sequences which are necessary to affect the expression of coding
sequences to which they are operatively linked. Expression control sequences are
sequences which control the transcription, post-transcriptional events and
translation of nucleic acid sequences. Expression control sequences include
appropriate transcription initiation, termination, promoter and enhancer sequences;
efficient RNA processing signals such as splicing and polyadenylation signals;
sequences that stabilize cytoplasmic mRNA; sequences that enhance translation
efficiency (e.g., ribosome binding sites); sequences that enhance protein stability;
and when desired, sequences that enhance protein secretion. The nature of such
control sequences differs depending upon the host organism; in prokaryotes, such
control sequences generally include promoter, ribosomal binding site, and
transcription termination sequence. The term "control sequences" is intended to
include, at a minimum, all components whose presence is essential for expression,
and can also include additional components whose presence is advantageous, for
example, leader sequences and fusion partner sequences.

[0085] The term "recombinant host cell" (or simply "host cell"), as used herein,
is intended to refer to a cell into which a recombinant vector has been introduced.
It should be understood that such terms are intended to refer not only to the
particular subject cell but to the progeny of such a cell. Because certain
modifications may occur in succeeding generations due to either mutation or
environmental influences, such progeny may not, in fact, be identical to the parent
cell, but are still included within the scope of the term "host cell" as used herein. A
recombinant host cell may be an isolated cell or cell line grown in culture or may
be a cell which resides in a living tissue or organism.

[0086] The term "peptide" as used herein refers to a short polypeptide, e.g., one
that is typically less than about 50 amino acids long and more typically less than
about 30 amino acids long. The term as used herein encompasses analogs and
mimetics that mimic structural and thus biological function.

[0087] The term "polypeptide" encompasses both naturally-occurring and
non-naturally-occurring proteins, and fragments, mutants, derivatives and analogs
thereof. A polypeptide may be monomeric or polymeric. Further, a polypeptide
may comprise a number of different domains each of which has one or more
distinct activities.

[0088] The term "isolated protein" or "isolated polypeptide" is a protein or
polypeptide that by virtue of its origin or source of derivation (1) is not associated
with naturally associated components that accompany it in its native state, (2)
exists in a purity not found in nature, where purity can be adjudged with
respect to the presence of other cellular material (e.g., is free of other proteins from
the same species), (3) is expressed by a cell from a different species, or (4) does not
occur in nature (e.g., it is a fragment of a polypeptide found in nature or it includes
amino acid analogs or derivatives not found in nature or linkages other than
standard peptide bonds). Thus, a polypeptide that is chemically synthesized or
synthesized in a cellular system different from the cell from which it naturally
originates will be "isolated" from its naturally associated components. A
polypeptide or protein may also be rendered substantially free of naturally
associated components by isolation, using protein purification techniques well
known in the art. As thus defined, "isolated" does not necessarily require that the
protein, polypeptide, peptide or oligopeptide so described has been physically
removed from its native environment.

[0089] The term "polypeptide fragment" as used herein refers to a polypeptide
that has an amino-terminal and/or carboxy-terminal deletion compared to a
full-length polypeptide. In a preferred embodiment, the polypeptide fragment is a
contiguous sequence in which the amino acid sequence of the fragment is identical
to the corresponding positions in the naturally-occurring sequence. Fragments
typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14,
16 or 18 amino acids long, more preferably at least 20 amino acids long, more
preferably at least 25, 30, 35, 40 or 45 amino acids, even more preferably at least
50 or 60 amino acids long, and even more preferably at least 70 amino acids long.

[0090] A "modified derivative" refers to polypeptides or fragments thereof that
are substantially homologous in primary structural sequence but which include,
e.g., in vivo or in vitro chemical and biochemical modifications or which
incorporate amino acids that are not found in the native polypeptide. Such
modifications include, for example, acetylation, carboxylation, phosphorylation,
glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various
enzymatic modifications, as will be readily appreciated by those well skilled in the
art. A variety of methods for labeling polypeptides and of substituents or labels
useful for such purposes are well known in the art, and include radioactive isotopes
such as 125I, 32P, 35S, and 3H, ligands which bind to labeled antiligands (e.g.,
antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands
which can serve as specific binding pair members for a labeled ligand. The choice
of label depends on the sensitivity required, ease of conjugation with the primer,
stability requirements, and available instrumentation. Methods for labeling
polypeptides are well known in the art. See Ausubel et al., 1992, hereby
incorporated by reference.

[0091] The term "fusion protein" refers to a polypeptide comprising a
polypeptide or fragment coupled to heterologous amino acid sequences. Fusion
proteins are useful because they can be constructed to contain two or more desired
functional elements from two or more different proteins. A fusion protein
comprises at least 10 contiguous amino acids from a polypeptide of interest, more
preferably at least 20 or 30 amino acids, even more preferably at least 40, 50 or 60
amino acids, yet more preferably at least 75, 100 or 125 amino acids. Fusion
proteins can be produced recombinantly by constructing a nucleic acid sequence
which encodes the polypeptide or a fragment thereof in frame with a nucleic acid
sequence encoding a different protein or peptide and then expressing the fusion
protein. Alternatively, a fusion protein can be produced chemically by
crosslinking the polypeptide or a fragment thereof to another protein.

[0092] The term "non-peptide analog" refers to a compound with properties that
are analogous to those of a reference polypeptide. A non-peptide compound may
also be termed a "peptide mimetic" or a "peptidomimetic". See, e.g., Jones, (1992)
Amino Acid and Peptide Synthesis, Oxford University Press; Jung, (1997)
Combinatorial Peptide and Nonpeptide Libraries: A Handbook, John Wiley;
Bodanszky et al., (1993) Peptide Chemistry-A Practical Textbook, Springer
Verlag; "Synthetic Peptides: A Users Guide", G. A. Grant, Ed, W. H. Freeman and
Co., 1992; Evans et al. J. Med. Chem. 30:1229 (1987); Fauchere, J. Adv. Drug Res.
15:29 (1986); Veber and Freidinger TINS p.392 (1985); and references cited in
each of the above, which are incorporated herein by reference. Such compounds
are often developed with the aid of computerized molecular modeling. Peptide
mimetics that are structurally similar to useful peptides of the invention may be
used to produce an equivalent effect and are therefore envisioned to be part of the
invention.

[0093] A "polypeptide mutant" or "mutein" refers to a polypeptide whose
sequence contains an insertion, duplication, deletion, rearrangement or substitution
of one or more amino acids compared to the amino acid sequence of a native or
wild type protein. A mutein may have one or more amino acid point substitutions,
in which a single amino acid at a position has been changed to another amino acid,
one or more insertions and/or deletions, in which one or more amino acids are
inserted or deleted, respectively, in the sequence of the naturally-occurring protein,
and/or truncations of the amino acid sequence at either or both the amino or
carboxy termini. A mutein may have the same but preferably has a different
biological activity compared to the naturally-occurring protein. For instance, a
mutein may have an increased or decreased neuron or NgR binding activity. In a
preferred embodiment of the present invention, a MAG derivative that is a mutein
(e.g., in MAG Ig-like domain 5) has decreased neuronal growth inhibitory activity
compared to endogenous or soluble wild-type MAG.

[0094] A mutein has at least 70% overall sequence homology to its wild-type
counterpart. Even more preferred are muteins having 80%, 85% or 90% overall
sequence homology to the wild-type protein. In an even more preferred
embodiment, a mutein exhibits 95% sequence identity, even more preferably 97%,
even more preferably 98% and even more preferably 99% overall sequence
identity. Sequence homology may be measured by any common sequence analysis
algorithm, such as Gap or Bestfit.

[0095] Preferred amino acid substitutions are those which: (1) reduce
susceptibility to proteolysis, (2) reduce susceptibility to oxidation, (3) alter binding
affinity for forming protein complexes, (4) alter binding affinity or enzymatic
activity, and (5) confer or modify other physicochemical or functional properties of
such analogs.

[0096] As used herein, the twenty conventional amino acids and their
abbreviations follow conventional usage. See Immunology - A Synthesis (2nd
Edition, E.S. Golub and D.R. Gren, Eds., Sinauer Associates, Sunderland, Mass.
(1991)), which is incorporated herein by reference. Stereoisomers (e.g., D-amino
acids) of the twenty conventional amino acids, unnatural amino acids such as α-,
α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino
acids may also be suitable components for polypeptides of the present invention.
Examples of unconventional amino acids include: 4-hydroxyproline,
γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine,
N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine,
s-N-methylarginine, and other similar amino acids and imino acids (e.g.,
4-hydroxyproline). In the polypeptide notation used herein, the left-hand direction
is the amino terminal direction and the right hand direction is the carboxy-terminal
direction, in accordance with standard usage and convention.

[0097] A protein has "homology" or is "homologous" to a second protein if the
nucleic acid sequence that encodes the protein has a similar sequence to the nucleic
acid sequence that encodes the second protein. Alternatively, a protein has
homology to a second protein if the two proteins have "similar" amino acid
sequences. (Thus, the term "homologous proteins" is defined to mean that the two
proteins have similar amino acid sequences). In a preferred embodiment, a
homologous protein is one that exhibits 60% sequence homology to the wild type
protein, more preferred is 70% sequence homology. Even more preferred are
homologous proteins that exhibit 80%, 85% or 90% sequence homology to the
wild type protein. In a yet more preferred embodiment, a homologous protein
exhibits 95%, 97%, 98% or 99% sequence identity. As used herein, homology
between two regions of amino acid sequence (especially with respect to predicted
structural similarities) is interpreted as implying similarity in function.

[0098] When "homologous" is used in reference to proteins or peptides, it is
recognized that residue positions that are not identical often differ by conservative
amino acid substitutions. A "conservative amino acid substitution" is one in which
an amino acid residue is substituted by another amino acid residue having a side
chain (R group) with similar chemical properties (e.g., charge or hydrophobicity).
In general, a conservative amino acid substitution will not substantially change the
functional properties of a protein. In cases where two or more amino acid
sequences differ from each other by conservative substitutions, the percent
sequence identity or degree of homology may be adjusted upwards to correct for
the conservative nature of the substitution. Means for making this adjustment are
well known to those of skill in the art (see, e.g., Pearson et al., 1994, herein
incorporated by reference).

[0099] The following six groups each contain amino acids that are conservative
substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D),
Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine
(K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V), and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
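By way of illustration only, the six groups listed above can be encoded as sets
of one-letter codes and used to test whether a given substitution is
conservative. In the Python sketch below, the names are hypothetical, and
treating an unchanged residue as "not a substitution" is an assumption made
here.

```python
# The six conservative substitution groups, as one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution (assumption)
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("S", "T"))  # both in group 1
    print(is_conservative("D", "K"))  # groups 2 and 4 differ
```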

[0100] Sequence homology for polypeptides, which is also referred to as percent
sequence identity, is typically measured using sequence analysis software. See,
e.g., the Sequence Analysis Software Package of the Genetics Computer Group
(GCG), University of Wisconsin Biotechnology Center, 910 University Avenue,
Madison, Wisconsin 53705. Protein analysis software matches similar sequences
using measure of homology assigned to various substitutions, deletions and other
modifications, including conservative amino acid substitutions. For instance, GCG
contains programs such as "Gap" and "Bestfit" which can be used with default
parameters to determine sequence homology or sequence identity between closely
related polypeptides, such as homologous polypeptides from different species of
organisms or between a wild type protein and a mutein thereof. See, e.g., GCG
Version 6.1.

[0101] A preferred algorithm when comparing an inhibitory molecule sequence to
a database containing a large number of sequences from different organisms is the
computer program BLAST (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410;
Gish and States (1993) Nature Genet. 3:266-272; Madden, T.L. et al. (1996) Meth.
Enzymol. 266:131-141; Altschul, S.F. et al. (1997) Nucleic Acids Res. 25:3389-3402;
Zhang, J. and Madden, T.L. (1997) Genome Res. 7:649-656), especially
blastp or tblastn (Altschul et al., 1997). Preferred parameters for BLASTp are:

Expectation value: 10 (default)
Filter: seg (default)
Cost to open a gap: 11 (default)
Cost to extend a gap: 1 (default)
Max. alignments: 100 (default)
Word size: 11 (default)
No. of descriptions: 100 (default)
Penalty Matrix: BLOSUM62

[0102] The length of polypeptide sequences compared for homology will
generally be at least about 16 amino acid residues, usually at least about 20
residues, more usually at least about 24 residues, typically at least about 28
residues, and preferably more than about 35 residues. When searching a database
containing sequences from a large number of different organisms, it is preferable to
compare amino acid sequences. Database searching using amino acid sequences
can be measured by algorithms other than blastp known in the art. For instance,
polypeptide sequences can be compared using FASTA, a program in GCG Version
6.1. FASTA provides alignments and percent sequence identity of the regions of
the best overlap between the query and search sequences (Pearson, 1990, herein
incorporated by reference). For example, percent sequence identity between amino
acid sequences can be determined using FASTA with its default parameters (a
word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1,
herein incorporated by reference.

[0103] "Specific binding" refers to the ability of two molecules to bind to each
other in preference to binding to other molecules in the environment. Typically,
"specific binding" discriminates over adventitious binding in a reaction by at least
two-fold, more typically by at least 10-fold, often at least 100-fold. Typically, the
affinity or avidity of a specific binding reaction is at least about 10⁻⁷ M (e.g., at
least about 10⁻⁸ M or 10⁻⁹ M).

[0104] The term "region" as used herein refers to a physically contiguous portion
of the primary structure of a biomolecule. In the case of proteins, a region is
defined by a contiguous portion of the amino acid sequence of that protein.

[0105] The term "domain" as used herein refers to a structure of a biomolecule
that contributes to a known or suspected function of the biomolecule. Domains
may be co-extensive with regions or portions thereof; domains may also include
distinct, non-contiguous regions of a biomolecule. Examples of protein domains
include, but are not limited to, an Ig domain, an extracellular domain, a
transmembrane domain, and a cytoplasmic domain.

[0106] As used herein, the term "molecule" means any compound, including, but
not limited to, a small molecule, peptide, protein, sugar, nucleotide, nucleic acid,
lipid, etc., and such a compound can be natural or synthetic.

[0107] Unless otherwise defined, all technical and scientific terms used herein
have the same meaning as commonly understood by one of ordinary skill in the art
to which this invention pertains. Exemplary methods and materials are described
below, although methods and materials similar or equivalent to those described
herein can also be used in the practice of the present invention and will be apparent
to those of skill in the art. All publications and other references mentioned herein
are incorporated by reference in their entirety. In case of conflict, the present
specification, including definitions, will control. The materials, methods, and
examples are illustrative only and not intended to be limiting.

[0108] Throughout this specification and claims, the word "comprise" or
variations such as "comprises" or "comprising", will be understood to imply the
inclusion of a stated integer or group of integers but not the exclusion of any other
integer or group of integers.

Engineering or Selecting Hosts With Modified Lipid-Linked Oligosaccharides
For The Generation of Human-like N-Glycans

[0109] The invention provides a method for producing a human-like glycoprotein
in a non-human eukaryotic host cell. The method involves making or using a
non-human eukaryotic host cell diminished or depleted in an alg gene activity (i.e., alg
activities, including equivalent enzymatic activities in non-fungal host cells) and
introducing into the host cell at least one glycosidase activity. In a preferred
embodiment, the glycosidase activity is introduced by causing expression of one or
more mannosidase activities within the host cell, for example, by activation of a
mannosidase activity, or by expression from a nucleic acid molecule of a
mannosidase activity, in the host cell.

[0110] In another embodiment, the method involves making or using a host cell
diminished or depleted in the activity of one or more enzymes that transfer a sugar
residue to the 1,6 arm of lipid-linked oligosaccharide precursors (Fig. 1). A host
cell of the invention is selected for or is engineered by introducing a mutation in
one or more of the genes encoding an enzyme that transfers a sugar residue to (e.g.,
mannosylates) the 1,6 arm of a lipid-linked oligosaccharide precursor. The sugar
residue is preferably a glucose, GlcNAc, galactose, sialic acid, fucose or GlcNAc
phosphate residue, and is more preferably mannose. In a preferred embodiment, the
activity of one or more enzymes that mannosylate the 1,6 arm of lipid-linked
oligosaccharide precursors is diminished or depleted. The method may further
comprise the step of introducing into the host cell at least one glycosidase activity
(see below).

[0111] In yet another embodiment, the invention provides a method for
producing a human-like glycoprotein in a non-human host, wherein the
glycoprotein comprises an N-glycan having at least two GlcNAcs attached to a
trimannose core structure.

[0112] In each above embodiment, the method is directed to making a host cell
in which the lipid-linked oligosaccharide precursors are enriched in ManXGlcNAc2
structures, where X is 3, 4 or 5 (Fig. 2). These structures are transferred in the ER
of the host cell onto nascent polypeptide chains by an oligosaccharyl-transferase
and may then be processed by treatment with glycosidases (e.g., α-mannosidases)
and glycosyltransferases (e.g., GnT I) to produce N-glycans having
GlcNAcManXGlcNAc2 core structures, wherein X is 3, 4 or 5, and is preferably 3
(Figs. 2 and 3). As shown in Fig. 2, N-glycans having a GlcNAcManXGlcNAc2
core structure where X is greater than 3 may be converted to
GlcNAcMan3GlcNAc2, e.g., by treatment with an α-1,3 and/or α-1,2-1,3
mannosidase activity, where applicable.

[0113] Additional processing of GlcNAcMan3GlcNAc2 by treatment with
glycosyltransferases (e.g., GnTII) produces GlcNAc2Man3GlcNAc2 core structures
which may then be modified, as desired, e.g., by ex vivo treatment or by
heterologous expression in the host cell of a set of glycosylation enzymes,
including glycosyltransferases, sugar transporters and mannosidases (see below),
to become human-like N-glycans. Preferred human-like glycoproteins which may
be produced according to the invention include those which comprise N-glycans
having seven or fewer, or three or fewer, mannose residues; comprise one or more
sugars selected from the group consisting of galactose, GlcNAc, sialic acid, and
fucose; and comprise at least one oligosaccharide branch comprising the structure
NeuNAc-Gal-GlcNAc-Man.

[0114] In one embodiment, the host cell has diminished or depleted
Dol-P-Man:Man5GlcNAc2-PP-Dol Mannosyltransferase activity, which is an activity
involved in the first mannosylation step from Man5GlcNAc2-PP-Dol to
Man6GlcNAc2-PP-Dol at the luminal side of the ER (e.g., ALG3; Fig. 1; Fig. 2). In
S. cerevisiae, this enzyme is encoded by the ALG3 gene. As described above,
S. cerevisiae cells harboring a leaky alg3-1 mutation accumulate
Man5GlcNAc2-PP-Dol and cells having a deletion in alg3 appear to transfer Man5GlcNAc2
structures onto nascent polypeptide chains within the ER. Accordingly, in this
embodiment, host cells will accumulate N-glycans enriched in Man5GlcNAc2
structures which can then be converted to GlcNAc2Man3GlcNAc2 by treatment
with glycosidases (e.g., with α-1,2 mannosidase, α-1,3 mannosidase or α-1,2-1,3
mannosidase activities) (Fig. 2).

[0115] As described in Example 1, degenerate primers were designed based on
an alignment of Alg3 protein sequences from S. cerevisiae, D. melanogaster and
humans (H. sapiens) (Figs. 4 and 5), and were used to amplify a product from P.
pastoris genomic DNA. The resulting PCR product was used as a probe to identify
and isolate a P. pastoris genomic clone comprising an open reading frame (ORF)
that encodes a protein having 35% overall sequence identity and 53% sequence
similarity to the S. cerevisiae ALG3 gene (Figs. 6 and 7). This P. pastoris gene is
referred to herein as "PpALG3". The ALG3 gene was similarly identified and
isolated from K. lactis (Example 1; Figs. 8 and 9).

[0116] Thus, in another embodiment, the invention provides an isolated nucleic
acid molecule having a nucleic acid sequence comprising or consisting of at least
forty-five, preferably at least 50, more preferably at least 60 and most preferably
75 or more nucleotide residues of the P. pastoris ALG3 gene (Fig. 6) and the K.
lactis ALG3 gene (Fig. 8), and homologs, variants and derivatives thereof. The
invention also provides nucleic acid molecules that hybridize under stringent
conditions to the above-described nucleic acid molecules. Similarly, isolated
polypeptides (including muteins, allelic variants, fragments, derivatives, and
analogs) encoded by the nucleic acid molecules of the invention are provided
(P. pastoris and K. lactis ALG3 gene products are shown in Figs. 6 and 8). In
addition, also provided are vectors, including expression vectors, which comprise a
nucleic acid molecule of the invention, as described further herein.

[0117] Using gene-specific primers, a construct was made to delete the PpALG3
gene from the genome of P. pastoris (Example 1). This strain was used to
generate a host cell depleted in Dol-P-Man:Man5GlcNAc2-PP-Dol
Mannosyltransferase activity and produce lipid-linked Man5GlcNAc2-PP-Dol
precursors which are transferred onto nascent polypeptide chains to produce
N-glycans having a Man5GlcNAc2 carbohydrate structure.

[0118] As described in Example 2, such a host cell may be engineered by
expression of appropriate mannosidases to produce N-glycans having the desired
Man3GlcNAc2 core carbohydrate structure. Expression of GnTs in the host cell
(e.g., by targeting a nucleic acid molecule or a library of nucleic acid molecules as
described below) enables the modified host cell to produce N-glycans having one
or two GlcNAc structures attached to each arm of the Man3 core structure (i.e.,
GlcNAc1Man3GlcNAc2 or GlcNAc2Man3GlcNAc2; see Fig. 3). These structures
may be processed further using the methods of the invention to produce
human-like N-glycans on proteins which enter the secretion pathway of the host cell.

[0119] In another embodiment, the host cell has diminished or depleted
dolichyl-P-Man:Man6GlcNAc2-PP-dolichyl α-1,2 mannosyltransferase activity, which is an
α-1,2 mannosyltransferase activity involved in the mannosylation step converting
Man6GlcNAc2-PP-Dol to Man7GlcNAc2-PP-Dol at the luminal side of the ER (see
above and Figs. 1 and 2). In S. cerevisiae, this enzyme is encoded by the ALG9
gene. Cells harboring an alg9 mutation accumulate Man6GlcNAc2-PP-Dol (Fig. 2)
and transfer Man6GlcNAc2 structures onto nascent polypeptide chains within the
ER. Accordingly, in this embodiment, host cells will accumulate N-glycans
enriched in Man6GlcNAc2 structures which can then be processed down to core
Man3 structures by treatment with α-1,2 and α-1,3 mannosidases (see Fig. 3 and
Examples 3 and 4).

[0120] A host cell in which the alg9 gene (or gene encoding an equivalent
activity) has been deleted is constructed (see, e.g., Example 3). Deletion of ALG9
(or ALG12; see below) creates a host cell which produces N-glycans with one or
two additional mannoses, respectively, on the 1,6 arm (Fig. 2). In order to make
the 1,6 core-mannose accessible to N-acetylglucosaminyltransferase II (GnTII)
these mannoses have to be removed by glycosidase(s). ER mannosidase typically
will remove the terminal 1,2 mannose on the 1,6 arm and subsequently
Mannosidase II (alpha 1-3,6 mannosidase) or other mannosidases such as alpha
1,2, alpha 1,3 or alpha 1-2,3 mannosidases (e.g., from Xanthomonas manihotis; see
Example 4) can act upon the 1,6 arm and subsequently GnTII can transfer an
N-acetylglucosamine, resulting in GlcNAc2Man3 (Fig. 2).

[0121] The resulting host cell, which is depleted for alg9p activity, is engineered
to express α-1,2 and α-1,3 mannosidase activity (from one or more enzymes, and
preferably, by expression from a nucleic acid molecule introduced into the host cell
and which expresses an enzyme targeted to a preferred subcellular compartment
(see below)). Example 4 describes the cloning and expression of one such enzyme
from Xanthomonas manihotis.

[0122] In another embodiment, the host cell has diminished or depleted
dolichyl-P-Man:Man7GlcNAc2-PP-dolichyl α-1,6 mannosyltransferase activity, which is an
α-1,6 mannosyltransferase activity involved in the mannosylation step converting
Man7GlcNAc2-PP-Dol to Man8GlcNAc2-PP-Dol (which mannosylates the α-1,6
mannose on the 1,6 arm of the core mannose structure) at the luminal side of the
ER (see above and Figs. 1 and 2). In S. cerevisiae, this enzyme is encoded by the
ALG12 gene. Cells harboring an alg12 mutation accumulate
Man7GlcNAc2-PP-Dol (Fig. 2) and transfer Man7GlcNAc2 structures onto nascent polypeptide chains
within the ER. Accordingly, in this embodiment, host cells will accumulate
N-glycans enriched in Man7GlcNAc2 structures which can then be processed down to
core Man3 structures by treatment with α-1,2 and α-1,3 mannosidases (see Fig. 3
and Examples 3 and 4).

[0123] As described above for alg9 mutant hosts, the resulting host cell, which is
depleted for alg12p activity, is engineered to express α-1,2 and α-1,3 mannosidase
activity (e.g., from one or more enzymes, and preferably, by expression from one
or more nucleic acid molecules introduced into the host cell and which express an
enzyme activity which is targeted to a preferred subcellular compartment (see
below)).
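
The three mutant backgrounds described above differ in the lipid-linked precursor they accumulate, and therefore in how many mannose residues must be trimmed to reach the Man3 core. The following is a minimal bookkeeping sketch of that relationship; the dictionary and function names are hypothetical, and only the mannose counts stated in the text (alg3: Man5, alg9: Man6, alg12: Man7) are taken from the specification.

```python
# Mannose count of the lipid-linked precursor accumulated in each mutant
# background, as stated in the text (alg3 -> Man5, alg9 -> Man6, alg12 -> Man7).
ACCUMULATED_MANNOSES = {"alg3": 5, "alg9": 6, "alg12": 7}

# The target trimannose core (Man3GlcNAc2).
CORE_MANNOSES = 3


def describe_mutant(mutant: str) -> dict:
    """Return the accumulated structure and the number of mannose residues
    that mannosidase treatment would need to remove to reach Man3GlcNAc2."""
    mannoses = ACCUMULATED_MANNOSES[mutant]
    return {
        "structure": f"Man{mannoses}GlcNAc2",
        "mannoses_to_trim": mannoses - CORE_MANNOSES,
    }


if __name__ == "__main__":
    for name in ACCUMULATED_MANNOSES:
        print(name, describe_mutant(name))
```

The sketch only counts residues; it does not model which linkages (α-1,2, α-1,3 or α-1,6) each mannosidase can cleave.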
[0124] 

Engineering or Selecting Hosts Optionally Having Decreased Initiating
α-1,6 Mannosyltransferase Activity

[0125] In a preferred embodiment, the method of the invention involves making
or using a host cell which is both (a) diminished or depleted in the activity of an
alg gene or in one or more activities that mannosylate N-glycans on the α-1,6 arm
of the Man3GlcNAc2 ("Man3") core carbohydrate structure; and (b) diminished or
depleted in the activity of an initiating α-1,6-mannosyltransferase, i.e., an initiation
specific enzyme that initiates outer chain mannosylation (on the α-1,3 arm of the
Man3 core structure). In S. cerevisiae, this enzyme is encoded by the OCH1 gene.
Disruption of the och1 gene in S. cerevisiae results in a phenotype in which
N-linked sugars completely lack the poly-mannose outer chain. Previous approaches
for obtaining mammalian-type glycosylation in fungal strains have required
inactivation of OCH1 (see, e.g., Chiba, 1998). Disruption of the initiating
α-1,6-mannosyltransferase activity in a host cell of the invention is optional, however
(depending on the selected host cell), as the Och1p enzyme requires an intact
Man8GlcNAc for efficient mannose outer chain initiation. Thus, the host cells
selected or produced according to this invention, which accumulate lipid-linked
oligosaccharides having seven or fewer mannose residues will, after transfer,
produce hypoglycosylated N-glycans that will likely be poor substrates for Och1p
(see, e.g., Nakayama, 1997).

Engineering or Selecting Hosts Having Increased Glucosyltransferase Activity

[0126] As discussed above, glucosylated oligosaccharides are thought to be
transferred to nascent polypeptide chains at a much higher rate than their
nonglucosylated counterparts. It appears that substrate recognition by the
oligosaccharyltransferase complex is enhanced by addition of glucose to the
antennae of lipid-linked oligosaccharides. It is thus desirable to create or select
host cells capable of optimal glucosylation of the lipid-linked oligosaccharides. In
such host cells, underglycosylation will be substantially decreased or even
abolished, due to a faster and more efficient transfer of glucosylated Man5
structures onto the nascent polypeptide chain.

[0127] Accordingly, in another embodiment of the invention, the method is
directed to making a host cell in which the lipid-linked N-glycan precursors are
transferred efficiently to the nascent polypeptide chain in the ER. In a preferred
embodiment, transfer is augmented by increasing the level of glucosylation on the
branches of lipid-linked oligosaccharides which, in turn, will make them better
substrates for oligosaccharyltransferase.

[0128] In one preferred embodiment, the invention provides a method for making
a human-like glycoprotein which uses a host cell in which one or more enzymes
responsible for glucosylation of lipid-linked oligosaccharides in the ER has
increased activity. One way to enhance the degree of glucosylation of the
lipid-linked oligosaccharides is to overexpress one or more enzymes responsible for the
transfer of glucose residues onto the antennae of the lipid-linked oligosaccharide.
In particular, increasing α-1,3 glucosyltransferase activity will increase the amount
of glucosylated lipid-linked Man5 structures and will reduce or eliminate the
underglycosylation of secreted proteins. In S. cerevisiae, this enzyme is encoded
by the ALG6 gene.

[0129] Saccharomyces cerevisiae ALG6 and its human counterpart have been
cloned (Imbach, 1999; Reiss, 1996). Due to the evolutionary conservation of the
early steps of glycosylation, ALG6 loci are expected to be homologous between
species and may be cloned based on sequence similarities by anyone skilled in the
art. (The same holds true for cloning and identification of ALG8 and ALG10 loci
from different species.) In addition, different glucosyltransferases from different
species can then be tested to identify the ones with optimal activities.

[0130] The introduction of additional copies of an ALG6 gene and/or the
expression of ALG6 under the control of a strong promoter, such as the GAPDH
promoter, is one of several ways to increase the degree of glucosylated lipid-linked
oligosaccharides. The ALG6 gene from P. pastoris is cloned and expressed
(Example 5). ALG6 nucleic acid and amino acid sequences are shown in Fig. 25 (S.
cerevisiae) and Fig. 26 (P. pastoris). These sequences are compared to other
eukaryotic ALG6 sequences in Fig. 27.

[0131] Accordingly, another embodiment of the invention provides a method to
enhance the degree of glucosylation of lipid-linked oligosaccharides comprising
the step of increasing alpha-1,3 glucosyltransferase activity in a host cell. The
increase in activity may be achieved by overexpression of nucleic acid sequences
encoding the activity, e.g., by operatively linking the nucleic acid encoding the
activity with one or more heterologous expression control sequences. Preferred
expression control sequences include transcription initiation, termination, promoter
and enhancer sequences; RNA splice donor and polyadenylation signals; mRNA
stabilizing sequences; ribosome binding sites; protein stabilizing sequences; and
protein secretion sequences.

[0132] In another embodiment, the increase in alpha-1,3 glucosyltransferase
activity is achieved by introducing a nucleic acid molecule encoding the activity on
a multi-copy plasmid, using techniques well known to the skilled worker. In yet
another embodiment, the invention provides a method to enhance the degree of
glucosylation of lipid-linked oligosaccharides comprising decreasing the substrate
specificity of oligosaccharyl transferase activity in a host cell. This is achieved by,
for example, subjecting at least one nucleic acid encoding the activity to a technique
such as gene shuffling, in vitro mutagenesis, and error-prone polymerase chain
reaction, all of which are well-known to one of skill in the art. Naturally, ALG8 and
ALG10 can be overexpressed in a host cell and tested in a similar fashion.

[0133] Accordingly, in a preferred embodiment, the invention provides a method
for making a human-like glycoprotein using a host cell which is engineered or
selected so that one or more enzymes responsible for glucosylation of lipid-linked
oligosaccharides in the ER has increased activity. In a more preferred
embodiment, the invention uses a host cell which is both (a) diminished or depleted
in the activity of one or more alg gene activities or activities that mannosylate
N-glycans on the α-1,6 arm of the Man3GlcNAc2 ("Man3") core carbohydrate
structure and (b) engineered or selected so that one or more enzymes responsible
for glucosylation of lipid-linked oligosaccharides in the ER has increased activity.
The lipid-linked Man5 structure found in an alg3 mutant background, however, is
not a preferred substrate for Alg6p. Accordingly, the skilled worker may identify
Alg6p, Alg8p and Alg10p with an increased substrate specificity (Gibbs, 2001)
e.g., by subjecting nucleic acids encoding such enzymes to one or more rounds of
gene shuffling, error prone PCR, or in vitro mutagenesis approaches and selecting
for increased substrate specificity in a host cell of interest, using molecular biology
and genetic selection techniques well known to those of skill in the art. It will be
appreciated by the skilled worker that such techniques for improving enzyme
substrate specificities in a selected host strain are not limited to this particular
embodiment of the invention but rather, may be used in any embodiment to
optimize further the production of human-like N-glycans in a non-human host cell.

[0134] As described, once Man5 is transferred onto the nascent polypeptide
chain, expression of suitable α-1,2-mannosidase(s), as provided by the present
invention, will further trim Man5GlcNAc2 structures to yield the desired core
Man3GlcNAc2 structures. α-1,2-mannosidases remove only terminal α-1,2-linked
mannose residues and are expected to recognize the specific Man5GlcNAc2 through
Man7GlcNAc2 structures made in alg3, 9 and 12 mutant host cells and in
host cells in which homologs to these genes are mutated.

[0135] As schematically presented in Figure 3, co-expression of appropriate
UDP-sugar-transporter(s) and -transferase(s) will cap the terminal α-1,6 and α-1,3
residues with GlcNAc, resulting in the necessary precursor for mammalian-type
complex and hybrid N-glycosylation: GlcNAc2Man3GlcNAc2. The peptide-bound
N-linked oligosaccharide chain GlcNAc2Man3GlcNAc2 (Figure 3) then serves as a
precursor for further modification to a mammalian-type oligosaccharide structure.
Subsequent expression of galactosyl-transferases and genetically engineering the
capacity to transfer sialic acid will produce a mammalian-type (e.g., human-like)
N-glycan structure.
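
The stepwise capping described above can be sketched as a sequence of named additions to the Man3GlcNAc2 core. This is only an illustrative naming exercise: the step table and function name are hypothetical, and the count of two residues per step assumes that both antennae are capped, as in the biantennary structures named elsewhere in this specification.

```python
# Starting core structure after mannosidase trimming.
CORE = "Man3GlcNAc2"

# (enzyme activity, residue added, number of residues added).
# Assumption: each step caps both antennae, hence a count of two.
STEPS = [
    ("GnT I and GnT II", "GlcNAc", 2),
    ("galactosyltransferase", "Gal", 2),
    ("sialyltransferase", "NANA", 2),
]


def build_glycan(steps):
    """Return the list of structure names produced after each capping step."""
    structure = CORE
    history = [structure]
    for _enzyme, residue, count in steps:
        # Each new residue type is written in front of the existing structure.
        structure = f"{residue}{count}{structure}"
        history.append(structure)
    return history


if __name__ == "__main__":
    for name in build_glycan(STEPS):
        print(name)
```

The order of the steps matters in this sketch just as in the text: galactose is added only after GlcNAc capping, and sialic acid only after galactose.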

[0136] A desired host cell according to the invention can be engineered one
enzyme or more than one enzyme at a time. In addition, a library of genes
encoding potentially useful enzymes can be created, and a strain having one or
more enzymes with optimal activities or producing the most "human-like"
glycoproteins can be selected by transforming target host cells with one or more
members of the library. Lower eukaryotes that are able to produce glycoproteins
having the core N-glycan Man3GlcNAc2 are particularly useful because of the ease of
performing genetic manipulations, and safety and efficiency features. In a
preferred embodiment, at least one further glycosylation reaction is performed, ex
vivo or in vivo, to produce a human-like N-glycan. In a more preferred
embodiment, active forms of glycosylating enzymes are expressed in the
endoplasmic reticulum and/or Golgi apparatus of the host cell to produce the
desired human-like glycoprotein.

Host Cells

[0137] A preferred non-human host cell of the invention is a lower eukaryotic
cell, e.g., a unicellular or filamentous fungus, which is diminished or depleted in
the activity of one or more alg gene activities (including an enzymatic activity
which is a homolog or equivalent to an alg activity). Another preferred host cell of
the invention is diminished or depleted in the activity of one or more enzymes
(other than alg activities) that mannosylate the α-1,6 arm of a lipid-linked
oligosaccharide structure.

[0138] While lower eukaryotic host cells are preferred, a wide variety of host
cells having the aforementioned properties are envisioned as being useful in the
methods of the invention. Plant cells, for instance, may be engineered to express a
human-like glycoprotein according to the invention. Likewise, a variety of
non-human, mammalian host cells may be altered to express more human-like
glycoproteins using the methods of the invention. An appropriate host cell can be
engineered, or one of the many such mutants already described in yeasts may be
used. A preferred host cell of the invention, as exemplified herein, is a
hypermannosylation-minus (OCH1) mutant in Pichia pastoris which has further
been modified to delete the alg3 gene. Other preferred hosts are Pichia pastoris
mutants having och1 and alg9 or alg12 mutations.

Formation of complex N-glycans

[0139] The sequential addition of sugars to the modified, nascent N-glycan
structure involves the successful targeting of glycosyltransferases into the Golgi
apparatus and their successful expression. This process requires the functional
expression, e.g., of GnT I, in the early or medial Golgi apparatus as well as
ensuring a sufficient supply of UDP-GlcNAc (e.g., by expression of a
UDP-GlcNAc transporter).

[0140] To characterize the glycoproteins and to confirm the desired
glycosylation, the glycoproteins were purified, the N-glycans were PNGase-F
released and then analyzed by MALDI-TOF-MS (Example 2). Kringle 3 domain
of human plasminogen was used as the reporter protein. This soluble glycoprotein
was produced in P. pastoris in an alg3, och1 knockout background (Example 2).

[0141] GlcNAcMan5GlcNAc2 was produced as the predominant N-glycan after
addition of human GnT I and K. lactis UDP-GlcNAc transporter (Fig. 16;
Example 2). The mass of this N-glycan is consistent with the mass of
GlcNAcMan5GlcNAc2 at 1463 (m/z). To confirm the addition of the GlcNAc onto
Man5GlcNAc2, a β-N-hexosaminidase digest was performed, which revealed a
peak at 1260 (m/z), consistent with the mass of Man5GlcNAc2 (Fig. 17).

[0142] The N-glycans from the alg3 och1 deletion in one strain, PBP3 (Example
2), provided two distinct peaks at 1138 (m/z) and 1300 (m/z), which is consistent
with structures GlcNAcMan3GlcNAc2 and GlcNAcMan4GlcNAc2 (Fig. 18). After
an in vitro α-1,2-mannosidase digestion for redundant mannoses, a peak eluted at
1138 (m/z), which is consistent with GlcNAcMan3GlcNAc2 (Fig. 19). To confirm
the addition of the GlcNAc onto the Man3GlcNAc2 structure, a
β-N-hexosaminidase digest was performed, which revealed a peak at 934 (m/z),
consistent with the mass of Man3GlcNAc2 (Fig. 20).

[0143] The addition of the second GlcNAc onto GlcNAcMan3GlcNAc2 is shown
in Fig. 21. The peak at 1357 (m/z) corresponds to GlcNAc2Man3GlcNAc2. To
confirm the addition of the two GlcNAcs onto the core mannose structure
Man3GlcNAc2, another β-N-hexosaminidase digest was performed, which revealed
a peak at 934 (m/z), consistent with the mass of Man3GlcNAc2 (Fig. 22). This is
conclusive data displaying a complex-type glycoprotein made in yeast cells.
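
The single-residue peak shifts reported above can be checked against standard residue masses. The sketch below is a rough consistency check only: the function name and the 1.5 Da tolerance are assumptions, the residue masses are the usual monoisotopic values for a hexose (such as mannose) and an N-acetylhexosamine (such as GlcNAc), and the peak values are the nominal m/z figures quoted in the text.

```python
# Monoisotopic residue masses in Da: hexose (e.g., mannose) and
# N-acetylhexosamine (e.g., GlcNAc).
RESIDUE_MASS = {"Hex": 162.05, "HexNAc": 203.08}


def explain_shift(mz_before, mz_after, tol=1.5):
    """Return the residue whose mass matches the difference between two
    peaks within tol Da, or None if no single residue matches.

    The tolerance is an assumption chosen to accommodate the nominal
    (integer) m/z values quoted in the text.
    """
    delta = abs(mz_before - mz_after)
    for name, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return name
    return None


if __name__ == "__main__":
    # Hexosaminidase digests: loss of one GlcNAc.
    print(explain_shift(1463, 1260))
    print(explain_shift(1138, 934))
    # GlcNAcMan4GlcNAc2 versus GlcNAcMan3GlcNAc2: one mannose.
    print(explain_shift(1300, 1138))
```

Only shifts of exactly one residue are handled here; differences spanning several residues or different adduct ions would need a fuller composition search.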
[0144] The in vitro addition of UDP-galactose and β-1,4-galactosyltransferase
onto the GlcNAc2Man3GlcNAc2 resulted in a peak at 1664 (m/z), which is
consistent with the mass of Gal2GlcNAc2Man3GlcNAc2 (Fig. 23). Finally, the in
vitro addition of CMP-N-acetylneuraminic acid and sialyltransferase resulted in a
peak at 2248 (m/z), which is consistent with the mass of
NANA2Gal2GlcNAc2Man3GlcNAc2 (Fig. 24). The above data supports the use of
non-mammalian host cells, which are capable of producing complex human-like
glycoproteins.

Targeting of glycosyl- and galactosyl-transferases to specific organelles.
[0145] Much work has been dedicated to revealing the exact mechanism by
which these enzymes are retained and anchored to their respective organelle.
Although the mechanism is complex, evidence suggests that the stem region,
membrane spanning region and cytoplasmic tail individually or in concert direct
enzymes to the membrane of individual organelles and thereby localize the
associated catalytic domain to that locus.

[0146] The method by which active glycosyltransferases can be expressed and
directed to the appropriate organelle such that a sequential order of reactions may
occur, that leads to complex N-glycan formation, is as follows:

(A) Establish a DNA library of regions that are known to encode proteins/peptides
that mediate localization to a particular location in the secretory pathway (ER,
Golgi and trans Golgi network). A limited selection of such enzymes and their
respective location is shown in Table 1. These sequences may be selected from
the host to be engineered as well as other related or unrelated organisms. Generally
such sequences fall into three categories: (1) N-terminal sequences encoding a
cytosolic tail (ct), a transmembrane domain (tmd) and part of a somewhat more
ambiguously defined stem region (sr), which together or individually anchor
proteins to the inner (lumenal) membrane of the Golgi, (2) retrieval signals which
are generally found at the C-terminus such as the HDEL or KDEL tetrapeptide,
and (3) membrane spanning nucleotide sugar transporters, which are known to
locate in the Golgi. In the first case, where the localization region consists of
various elements (ct, tmd and sr) the library is designed such that the ct, the tmd
and various parts of the stem region are represented. This may be accomplished by
using PCR primers that bind to the 5' end of the DNA encoding the cytosolic
region and employing a series of opposing primers that bind to various parts of the
stem region. In addition one would create fusion protein constructs that encode
sugar nucleotide transporters and known retrieval signals.

(B) A second step involves the creation of a series of fusion protein constructs
that encode the above mentioned localization sequences and the catalytic domain
of a particular glycosyltransferase cloned in frame to such localization sequence
(e.g. GnT I, GalT, Fucosyltransferase or ST). In the case of a sugar nucleotide
transporter fused to a catalytic domain one may design such constructs such that
the catalytic domain (e.g. GnT I) is either at the N- or the C-terminus of the
resulting polypeptide. The catalytic domain, like the localization sequence, may be
derived from various different sources. The choice of such catalytic domains
may be guided by the knowledge of the particular environment in which the
catalytic domain is to be active. For example, if a particular glycosyltransferase
is to be active in the late Golgi, and all known enzymes of the host organism in
the late Golgi have a pH optimum of 7.0, or the late Golgi is known to have a
particular pH, one would try to select a catalytic domain that has maximum activity
at that pH. Existing in vivo data on the activity of such enzymes, in particular
hosts, may also be of use. For example, Schwientek and coworkers showed that
GalT activity can be engineered into the Golgi of S. cerevisiae and showed that
such activity was present by demonstrating the transfer of some Gal to existing
GlcNAc2 in an alg mutant of S. cerevisiae. In addition, one may perform several
rounds of gene shuffling or error prone PCR to obtain a larger diversity within the
pool of fusion constructs, since it has been shown that single amino acid mutations
may drastically alter the activity of glycoprotein processing enzymes (Romero et
al., 2000). Full length sequences of glycosyltransferases and their endogenous
anchoring sequence may also be used. In a preferred embodiment, such
localization/catalytic domain libraries are designed to incorporate existing
information on the sequential nature of glycosylation reactions in higher
eukaryotes. In other words, reactions known to occur early in the course of
glycoprotein processing require the targeting of enzymes that catalyze such
reactions to an early part of the Golgi or the ER. For example, the trimming of
Man8GlcNAc2 to Man5GlcNAc2 is an early step in complex N-glycan formation.
Since protein processing is initiated in the ER and then proceeds through the early,
medial and late Golgi, it is desirable to have this reaction occur in the ER or early
Golgi. When designing a library for mannosidase I localization, one thus attempts
to match ER and early Golgi targeting signals with the catalytic domain of
mannosidase I.
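
The combinatorial idea behind steps (A) and (B) can be sketched in a few lines of Python. This is only an illustration, not the cloning procedure itself: the `Fusion` type and the `build_library` helper are hypothetical names, and the short lists of localization sequences and catalytic domains are placeholder entries taken from Table 1 and the surrounding text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Fusion:
    localization: str  # targeting region (ct/tmd/sr fragment, retrieval signal, ...)
    catalytic: str     # catalytic domain cloned in frame to the targeting region


def build_library(localizations, catalytic_domains):
    """Enumerate every localization/catalytic-domain pairing (hypothetical helper)."""
    return [Fusion(loc, cat) for loc, cat in product(localizations, catalytic_domains)]


# Placeholder entries for illustration only.
localizations = [
    "MnsI (ER)",
    "Och1 (cis Golgi)",
    "Mnn2 (medial Golgi)",
    "Mnn1 (trans Golgi)",
]
catalytic_domains = ["mannosidase I", "GnT I", "GalT"]

library = build_library(localizations, catalytic_domains)
print(len(library))  # number of candidate fusion constructs
```

Each element of `library` corresponds to one construct that would then be transformed and screened; in practice the list would also include several stem-region truncations per localization sequence.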

[0147] Upon transformation of the host strain with the fusion construct library a
selection process is used to identify which particular combination of localization
sequence and catalytic domain in fact has the maximum effect on the
carbohydrate structure found in such host strain. Such selection can be based on
any number of assays or detection methods. They may be carried out manually or
may be automated through the use of high throughput screening equipment.

[0148] In another example, GnT I activity is required for the maturation of
complex N-glycans, because only after addition of GlcNAc to the terminal alpha-1,3
mannose residue may further trimming of such a structure to the subsequent
intermediate GlcNAcMan3GlcNAc2 structure occur. Mannosidase II is most likely
not capable of removing the terminal alpha-1,3- and alpha-1,6-mannose residues in
the absence of a terminal beta-1,2-GlcNAc and thus the formation of complex
N-glycans will not proceed in the absence of GnT I activity (Schachter, 1991).
Alternatively, one may first engineer or select a strain that makes sufficient
quantities of Man5GlcNAc2 as described in this invention by engineering or
selecting a strain deficient in Alg3P activity. In the presence of sufficient
UDP-GlcNAc transporter activity, as may be achieved by engineering or selecting a
strain that has such UDP-GlcNAc transporter activity, GlcNAc can be added to the
terminal alpha-1,3 residue by GnT I, as in vitro a Man3 structure is recognized
by rat liver GnT I (Moller, 1992).
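
The ordering constraint described above (mannosidase II acts only after GnT I has added a terminal GlcNAc) can be made concrete with a toy model. The sketch below is a simplification under stated assumptions: a glycan is tracked purely by composition, as a tuple of (terminal GlcNAc, mannose, core GlcNAc) counts, and the function names are hypothetical. It ignores linkages, enzyme kinetics and localization.

```python
# Toy composition model: (terminal GlcNAc, Man, core GlcNAc). Linkages are ignored.

def gnt1(glycan):
    """GnT I adds one GlcNAc to Man5GlcNAc2 (simplified substrate check)."""
    glcnac, man, core = glycan
    if (glcnac, man) != (0, 5):
        raise ValueError("GnT I in this sketch expects Man5GlcNAc2")
    return (1, man, core)


def mannosidase2(glycan):
    """Mannosidase II trims two mannoses, but only if a terminal GlcNAc is present."""
    glcnac, man, core = glycan
    if glcnac < 1 or man != 5:
        raise ValueError("mannosidase II requires the GlcNAc added by GnT I")
    return (glcnac, 3, core)


def name(glycan):
    """Render a composition tuple in the notation used in the text."""
    glcnac, man, core = glycan
    prefix = "" if glcnac == 0 else "GlcNAc" if glcnac == 1 else f"GlcNAc{glcnac}"
    return f"{prefix}Man{man}GlcNAc{core}"


man5 = (0, 5, 2)
print(name(mannosidase2(gnt1(man5))))  # GlcNAcMan3GlcNAc2
```

Calling `mannosidase2(man5)` directly raises `ValueError`, which mirrors the statement that complex N-glycan formation does not proceed without GnT I activity.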

[0149] In another approach, one may incorporate the expression of a UDP-GlcNAc
transporter into the library mentioned above such that the desired
construct will contain: (1) a region by which the transformed construct is
maintained in the cell (e.g. origin of replication or a region that mediates
chromosomal integration), (2) a marker gene that allows for the selection of cells
that have been transformed, including counterselectable and recyclable markers
such as ura3 or T-urf13 (Soderholm, 2001) or other well characterized selection
markers (e.g. his4, bla, Sh ble etc.), (3) a gene encoding a UDP-GlcNAc
transporter (e.g. from K. lactis (Abeijon, 1996), or from H. sapiens (Ishida, 1996)),
and (4) a promoter activating the expression of the above mentioned
localization/catalytic domain fusion construct library.
[0150] After transformation of the host with the library of fusion constructs
described above, one may screen for those cells that have the highest concentration
of terminal GlcNAc on the cell surface, or secrete the protein with the highest
terminal GlcNAc content. Such a screen may be based on a visual method, like a
staining procedure, the ability to bind specific terminal GlcNAc binding antibodies
or lectins conjugated to a marker (such lectins are available from E.Y. Laboratories
Inc., San Mateo, CA), the reduced ability of specific lectins to bind to terminal
mannose residues, the ability to incorporate a radioactively labeled sugar in vitro,
altered binding to dyes or charged surfaces, or may be accomplished by using a
Fluorescence Assisted Cell Sorting (FACS) device in conjunction with a
fluorophore labeled lectin or antibody (Guillen, 1998). It may be advantageous to
enrich particular phenotypes within the transformed population with cytotoxic
lectins. U.S. Patent No. 5,595,900 teaches several methods by which cells with
desired extra-cellular carbohydrate structures may be identified. Repeatedly
carrying out this strategy allows for the sequential engineering of more and more
complex glycans in lower eukaryotes.

[0151] After transformation, one may select for transformants that allow for the
most efficient transfer of GlcNAc by GlcNAc Transferase II from UDP-GlcNAc in
an in vitro assay. This screen may be carried out by growing cells harboring the
transformed library under selective pressure on an agar plate and transferring
individual colonies into a 96-well microtiter plate. After growing the cells, the
cells are centrifuged, the cells resuspended in buffer, and after addition of
UDP-GlcNAc and GnT V, the release of UDP is determined either by HPLC or an
enzyme linked assay for UDP. Alternatively, one may use radioactively labeled
UDP-GlcNAc and GnT V, wash the cells and then look for the release of
radioactive GlcNAc by N-acetylglucosaminidase. All this may be carried out manually
or automated through the use of high throughput screening equipment.
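
When a microtiter plate screen of this kind is automated, the final step amounts to ranking wells by their assay readout. A minimal sketch, assuming the UDP readings are already available as one number per well (the well IDs and values below are invented for illustration, and `top_wells` is a hypothetical helper):

```python
def top_wells(udp_release, n=3):
    """Return the n well IDs with the highest measured UDP release."""
    return sorted(udp_release, key=udp_release.get, reverse=True)[:n]


# Invented readings in arbitrary units, one entry per well of the plate.
readings = {"A1": 0.12, "A2": 0.85, "A3": 0.40, "B1": 0.91, "B2": 0.05}

print(top_wells(readings, n=2))  # ['B1', 'A2']
```

The same ranking applies to the radioactive readout; only the measured quantity changes.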

[0152] Transformants that release more UDP, in the first assay, or more
radioactively labeled GlcNAc in the second assay, are expected to have a higher
degree of GlcNAcMan3GlcNAc2 (Fig. 3) on their surface and thus constitute the
desired phenotype. Alternatively, one may use any other suitable screen such
as a lectin binding assay that is able to reveal altered glycosylation patterns on the
surface of transformed cells. In this case the reduced binding of lectins specific to
terminal mannoses may be a suitable selection tool. Galanthus nivalis lectin binds
specifically to terminal alpha-1,3 mannose, which is expected to be reduced if
sufficient mannosidase II activity is present in the Golgi. One may also enrich for
desired transformants by carrying out a chromatographic separation step that
allows for the removal of cells containing a high terminal mannose content. This
separation step would be carried out with a lectin column that specifically binds
cells with a high terminal mannose content (e.g. Galanthus nivalis lectin bound to
agarose, Sigma, St. Louis, MO) over those that have a low terminal mannose
content. In addition, one may directly create such fusion protein constructs, as
additional information on the localization of active carbohydrate modifying
enzymes in different lower eukaryotic hosts becomes available in the scientific
literature. For example, the prior art teaches us that human beta-1,4-GalTr can be
fused to the membrane domain of MNT, a mannosyltransferase from S. cerevisiae,
and localized to the Golgi apparatus while retaining its catalytic activity
(Schwientek et al., 1995). If S. cerevisiae or a related organism is the host to be
engineered one may directly incorporate such findings into the overall strategy to
obtain complex N-glycans from such a host. Several such gene fragments in
P. pastoris have been identified that are related to glycosyltransferases in
S. cerevisiae and thus could be used for that purpose.
Table 1

Gene or sequence     Organism                   Function                  Location of gene product
MnsI                 S. cerevisiae              mannosidase               ER
Och1                 S. cerevisiae              1,6-mannosyltransferase   Golgi (cis)
Mnn2                 S. cerevisiae              1,2-mannosyltransferase   Golgi (medial)
Mnn1                 S. cerevisiae              1,3-mannosyltransferase   Golgi (trans)
Och1                 P. pastoris                1,6-mannosyltransferase   Golgi (cis)
2,6 ST               H. sapiens, S. frugiperda  2,6-sialyltransferase     trans-Golgi network
beta-1,4 GalT        bovine milk                UDP-Gal transporter       Golgi
Mnt1                 S. cerevisiae              1,2-mannosyltransferase   Golgi (cis)
HDEL at C-terminus   S. cerevisiae              retrieval signal          ER

Integration Sites

[0153] As one ultimate goal of this genetic engineering effort is a robust protein
production strain that is able to perform well in an industrial fermentation process,
the integration of multiple genes into the host (e.g., fungal) chromosome involves
careful planning. The engineered strain will most likely have to be transformed
with a range of different genes, and these genes will have to be transformed in a
stable fashion to ensure that the desired activity is maintained throughout the
fermentation process. Any combination of the following enzyme activities will
have to be engineered into the fungal protein expression host: sialyltransferases,
mannosidases, fucosyltransferases, galactosyltransferases, glucosyltransferases,
GlcNAc transferases, ER and Golgi specific transporters (e.g. syn- and antiport
transporters for UDP-galactose and other precursors), other enzymes involved in
the processing of oligosaccharides, and enzymes involved in the synthesis of
activated oligosaccharide precursors such as UDP-galactose and
CMP-N-acetylneuraminic acid. At the same time, a number of genes which encode
enzymes known to be characteristic of non-human glycosylation reactions will
have to be deleted. Such genes and their corresponding proteins have been
extensively characterized in a number of lower eukaryotes (e.g. S. cerevisiae,
T. reesei, A. nidulans etc.), thereby providing a list of known glycosyltransferases
in lower eukaryotes, their activities and their respective genetic sequence. These
genes are likely to be selected from the group of mannosyltransferases, e.g. 1,3
mannosyltransferases (e.g. MNN1 in S. cerevisiae) (Graham, 1991), 1,2
mannosyltransferases (e.g. KTR/KRE family from S. cerevisiae), 1,6
mannosyltransferases (OCH1 from S. cerevisiae), mannosylphosphate transferases
(MNN4 and MNN6 from S. cerevisiae) and additional enzymes that are involved in
aberrant, i.e. non-human, glycosylation reactions. Many of these genes have in fact
been deleted individually giving rise to viable phenotypes with altered
glycosylation profiles. Examples are shown in Table 2:
Table 2.

Strain                      Mutant             Structure wild type          Structure mutant   Authors
Schizosaccharomyces pombe   OCH1               Mannan (i.e. Man>9GlcNAc2)   Man8GlcNAc2        Yoko-o et al., 2001
S. cerevisiae               OCH1, MNN1         Mannan (i.e. Man>9GlcNAc2)   Man8GlcNAc2        Nakanishi-Shindo et al., 1993
S. cerevisiae               OCH1, MNN1, MNN4   Mannan (i.e. Man>9GlcNAc2)   Man8GlcNAc2        Chiba et al., 1998

As any strategy to engineer the formation of complex N-glycans into a lower
eukaryote involves both the elimination as well as the addition of
glycosyltransferase activities, a comprehensive scheme will attempt to coordinate
both requirements. Genes that encode enzymes that are undesirable serve as
potential integration sites for genes that are desirable. For example, 1,6
mannosyltransferase activity is a hallmark of glycosylation in many known lower
eukaryotes. The gene encoding alpha-1,6 mannosyltransferase (OCH1) has been
cloned from S. cerevisiae and mutations in the gene give rise to a viable phenotype
with reduced mannosylation. The gene locus encoding alpha-1,6
mannosyltransferase activity therefore is a prime target for the integration of genes
encoding glycosyltransferase activity. In a similar manner, one can choose a range
of other chromosomal integration sites that, based on a gene disruption event in
that locus, are expected to: (1) improve the cell's ability to glycosylate in a more
human-like fashion, (2) improve the cell's ability to secrete proteins, (3) reduce
proteolysis of foreign proteins and (4) improve other characteristics of the process
that facilitate purification or the fermentation process itself.
Providing sugar nucleotide precursors

[0154] A hallmark of higher eukaryotic glycosylation is the presence of
galactose, fucose, and a high degree of terminal sialic acid on glycoproteins.
These sugars are not generally found on glycoproteins produced in yeast and
filamentous fungi and the method discussed above allows for the engineering of
strains that localize glycosyltransferase in the desired organelle. Formation of
complex N-glycans is a sequential process by which specific sugar
residues are removed and attached to the core oligosaccharide structure. In higher
eukaryotes, this is achieved by having the substrate sequentially exposed to various
processing enzymes. These enzymes carry out specific reactions depending on
their particular location within the entire processing cascade. This "assembly line"
consists of ER, early, medial and late Golgi, and the trans Golgi network, all with
their specific processing environment. To recreate the processing of human
glycoproteins in the Golgi and ER of lower eukaryotes, numerous enzymes (e.g.
glycosyltransferases, glycosidases, phosphatases and transporters) have to be
expressed and specifically targeted to these organelles, and preferably, in a location
so that they function most efficiently in relation to their environment as well as to
other enzymes in the pathway.

[0155] Several individual glycosyltransferases
have been cloned and expressed in S. cerevisiae (GalT, GnT I), Aspergillus
nidulans (GnT I) and other fungi, without however demonstrating the desired
outcome of "humanization" on the glycosylation pattern of the organisms
(Yoshida, 1995; Schwientek, 1995; Kalsner, 1995). It was speculated that the
carbohydrate structure required to accept sugars by the action of such
glycosyltransferases was not present in sufficient amounts. While this most likely
contributed to the lack of complex N-glycan formation, there are currently no
reports of a fungus supplying a Man5GlcNAc2 structure, having GnT I activity and
having UDP-Gn transporter activity engineered into the fungus. It is the
combination of these three biochemical events that is required for hybrid and
complex N-glycan formation.

[0156] In humans, the full range of nucleotide sugar precursors (e.g.
UDP-N-acetylglucosamine, UDP-N-acetylgalactosamine, CMP-N-acetylneuraminic acid,
UDP-galactose, etc.) are generally synthesized in the cytosol and transported into
the Golgi, where they are attached to the core oligosaccharide by
glycosyltransferases. To replicate this process in lower eukaryotes, sugar
nucleoside specific transporters have to be expressed in the Golgi to ensure
adequate levels of nucleoside sugar precursors (Sommers, 1981; Sommers, 1982;
Perez, 1987). A side product of this reaction is either a nucleoside diphosphate or
monophosphate. While monophosphates can be directly exported in exchange for
nucleoside triphosphate sugars by an antiport mechanism, diphospho nucleosides
(e.g. GDP) have to be cleaved by phosphatases (e.g. GDPase) to yield nucleoside
monophosphates and inorganic phosphate prior to being exported. This reaction
appears to be important for efficient glycosylation, as GDPase from S. cerevisiae
has been found to be necessary for mannosylation. However, the enzyme only has
10% of the activity towards UDP (Berninsone, 1994). Lower eukaryotes often do
not have UDP specific diphosphatase activity in the Golgi since they do not utilize
UDP-sugar precursors for glycoprotein synthesis in the Golgi.
[0157] Schizosaccharomyces pombe, a yeast found to add galactose residues to
cell wall polysaccharides (from UDP-galactose), was found to have specific
UDPase activity, further suggesting the requirement for such an enzyme
(Berninsone et al., 1994). UDP is known to be a potent inhibitor of
glycosyltransferases and the removal of this glycosylation side product is
important in order to prevent glycosyltransferase inhibition in the lumen of the
Golgi (Khatara et al., 1974). Thus, one may need to provide for the removal of
UDP, which is expected to accumulate in the Golgi of such engineered strains
(Berninsone, 1995; Beaudet, 1998).

[0158] In another example, 2,3
sialyltransferase and 2,6 sialyltransferase cap galactose residues with sialic acid in
the trans-Golgi and TGN of humans, leading to a mature form of the glycoprotein.
To reengineer this processing step into a metabolically engineered yeast or fungus
will require (1) 2,3-sialyltransferase activity and (2) a sufficient supply of
CMP-N-acetyl neuraminic acid, in the late Golgi of yeast. To obtain sufficient
2,3-sialyltransferase activity in the late Golgi, the catalytic domain of a known
sialyltransferase (e.g. from humans) has to be directed to the late Golgi in fungi
(see above). Likewise, transporters have to be engineered that allow the
transport of CMP-N-acetyl neuraminic acid into the late Golgi. There is currently
no indication that fungi synthesize sufficient amounts of CMP-N-acetyl neuraminic
acid, not to mention the transport of such a sugar-nucleotide into the Golgi.
Consequently, to ensure the adequate supply of substrate for the corresponding
glycosyltransferases, one has to metabolically engineer the production of
CMP-sialic acid into the fungus.

Methods for providing sugar nucleotide precursors to the Golgi apparatus:

UDP-N-acetylglucosamine
[0159] The cDNA of human UDP-N-acetylglucosamine transporter, which was
recognized through a homology search in the expressed sequence tags database
(dbEST), was cloned by Ishida and coworkers (Ishida, 1999). Guillen and
coworkers have cloned the mammalian Golgi membrane transporter for
UDP-N-acetylglucosamine by phenotypic correction with cDNA from canine kidney cells
(MDCK) of a recently characterized Kluyveromyces lactis mutant deficient in
Golgi transport of the above nucleotide sugar (Guillen, 1998). Their results
demonstrate that the mammalian Golgi UDP-GlcNAc transporter gene has all of
the necessary information for the protein to be expressed and targeted functionally
to the Golgi apparatus of yeast and that two proteins with very different amino acid
sequences may transport the same solute within the same Golgi membrane
(Guillen, 1998).

GDP-Fucose

[0160] The rat liver Golgi membrane GDP-fucose transporter has been identified
and purified by Puglielli, L. and C. B. Hirschberg (Puglielli, 1999). The
corresponding gene has not been identified; however, N-terminal sequencing can be
used for the design of oligonucleotide probes specific for the corresponding gene.
These oligonucleotides can be used as probes to clone the gene encoding the
GDP-fucose transporter.
UDP-Galactose

[0161] Two heterologous genes, gma12(+) encoding alpha 1,2-
galactosyltransferase (alpha 1,2 GalT) from Schizosaccharomyces pombe and
(hUGT2) encoding human UDP-galactose (UDP-Gal) transporter, have been
functionally expressed in S. cerevisiae to examine the intracellular conditions
required for galactosylation. Correlation between protein galactosylation and
UDP-galactose transport activity indicated that an exogenous supply of UDP-Gal
transporter, rather than alpha 1,2 GalT, played a key role for efficient
galactosylation in S. cerevisiae (Kainuma, 1999). Likewise a UDP-galactose
transporter from S. pombe was cloned (Aoki, 1999; Segawa, 1999).

CMP-N-acetylneuraminic acid (CMP-Sialic acid)
[0162] Human CMP-sialic acid transporter (hCST) has been cloned and
expressed in Lec 8 CHO cells (Aoki, 1999; Eckhardt, 1997). The functional
expression of the murine CMP-sialic acid transporter was achieved in
Saccharomyces cerevisiae (Berninsone, 1997). Sialic acid has been found in some
fungi; however, it is not clear whether the chosen host system will be able to supply
sufficient levels of CMP-Sialic acid. Sialic acid can be either supplied in the
medium or alternatively fungal pathways involved in sialic acid synthesis can also
be integrated into the host genome.

Diphosphatases

[0163] When sugars are transferred onto a glycoprotein, either a nucleoside
diphosphate or monophosphate is released from the sugar nucleotide precursors.
While monophosphates can be directly exported in exchange for nucleoside
triphosphate sugars by an antiport mechanism, diphospho nucleosides (e.g. GDP)
have to be cleaved by phosphatases (e.g. GDPase) to yield nucleoside
monophosphates and inorganic phosphate prior to being exported. This reaction
appears to be important for efficient glycosylation, as GDPase from S. cerevisiae
has been found to be necessary for mannosylation. However, the enzyme only has
10% of the activity towards UDP (Berninsone, 1994). Lower eukaryotes often do
not have UDP specific diphosphatase activity in the Golgi since they do not utilize
UDP-sugar precursors for glycoprotein synthesis in the Golgi.
Schizosaccharomyces pombe, a yeast found to add galactose residues to cell wall
polysaccharides (from UDP-galactose), was found to have specific UDPase activity,
further suggesting the requirement for such an enzyme (Berninsone, 1994). UDP
is known to be a potent inhibitor of glycosyltransferases and the removal of this
glycosylation side product is important in order to prevent glycosyltransferase
inhibition in the lumen of the Golgi (Khatara et al., 1974).

Expression Of GnTs To Produce Complex N-glycans

Expression Of GnT-III To Boost Antibody Functionality
[0164] The addition of an N-acetylglucosamine to the GlcNAc1Man3GlcNAc2
structure by N-acetylglucosaminyltransferases II and III yields a so-called bisected
N-glycan GlcNAc3Man3GlcNAc2 (Fig. 3). This structure has been implicated in
greater antibody-dependent cellular cytotoxicity (ADCC) (Umana et al. 1999).
Re-engineering glycoforms of immunoglobulins expressed by mammalian cells is a
tedious and cumbersome task. Especially in the case of GnTIII, where
over-expression of this enzyme has been implicated in growth inhibition, methods
involving regulated (inducible) gene expression had to be employed to produce
immunoglobulins with bisected N-glycans (Umana et al. 1999a, 1999b).
[0165] Accordingly, in another embodiment, the invention provides systems and
methods for producing human-like N-glycans having bisecting
N-acetylglucosamine (GlcNAcs) on the core mannose structure. In a preferred
embodiment, the invention provides a system and method for producing
immunoglobulins having bisected N-glycans. The systems and methods described
herein will not suffer from previous problems, e.g., cytotoxicity associated with
overexpression of GnTIII or ADCC, as the host cells of the invention are
engineered and selected to be viable and preferably robust cells which produce
N-glycans having substantially modified human-type glycoforms such as
GlcNAc2Man3GlcNAc2. Thus, addition of a bisecting N-acetylglucosamine in a
host cell of the invention will have a negligible effect on the growth-phenotype or
viability of those host cells.

[0166] In addition, previous work (Umana) has shown that there is no linear
correlation between GnTIII expression levels and the degree of ADCC. Finding
the optimal expression level in mammalian cells and maintaining it throughout an
FDA approved fermentation process seems to be a challenge. However, in cells of
the invention, such as fungal cells, finding a promoter of appropriate strength to
establish a robust, reliable and optimal GnTIII expression level is a comparatively
easy task for one of skill in the art.

[0167] A host cell such as a yeast strain capable of producing glycoproteins with
bisecting N-glycans is engineered according to the invention, by introducing into
the host cell a GnTIII activity (Example 6). Preferably, the host cell is
transformed with a nucleic acid that encodes GnTIII (see, e.g., Fig. 32) or a
domain thereof having enzymatic activity, optionally fused to a heterologous cell
signal targeting peptide (e.g., using the libraries and associated methods of the
invention). Host cells engineered to express GnTIII will produce higher
antibody titers than mammalian cells are capable of. They will also produce
antibodies with higher potency with respect to ADCC.

[0168] Antibodies produced by mammalian cell lines transfected with GnTIII
have been shown to be as effective as antibodies produced by non-transfected
cell-lines, but at a 10-20 fold lower concentration (Davies et al. 2001). An increase of
productivity of the production vehicle of the invention over mammalian systems by
a factor of twenty, and a ten-fold increase of potency will result in a
net-productivity improvement of two hundred. The invention thus provides a system
and method for producing high titers of an antibody having high potency (e.g., up
to several orders of magnitude more potent than what can currently be produced).
The system and method is safe and provides high potency antibodies at low cost in
short periods of time. Host cells engineered to express GnT III according to the
invention produce immunoglobulins having bisected N-glycans at rates of at least
50 mg/liter/day to at least 500 mg/liter/day. In addition, each immunoglobulin (Ig)
molecule (comprising bisecting GlcNAcs) is more potent than the same Ig
molecule produced without bisecting GlcNAcs.
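
The net-productivity figure quoted above is simply the product of the two fold-changes. As a worked check, using only the factors stated in the text (the variable names are illustrative):

```python
titer_gain = 20    # fold increase in productivity over mammalian systems (from the text)
potency_gain = 10  # fold increase in potency per antibody molecule (from the text)

# Effective output scales with both the amount made and the activity per molecule.
net_gain = titer_gain * potency_gain
print(net_gain)  # 200
```

The multiplication assumes the two improvements are independent, which is how the text treats them.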

Cloning and expression of GnT-IV and GnT-V

[0169] All branching structures in complex N-glycans are synthesized on a
common core-pentasaccharide (Man3GlcNAc2 or Man alpha1-6(Man alpha1-3)Man
beta1-4 GlcNAc beta1-4 GlcNAc beta1-4 or Man3GlcNAc2) by
N-acetylglucosamine transferases (GnTs) -I to -VI (Schachter H et al. (1989)
Methods Enzymol;179:351-97). Current understanding of the biosynthesis of more
highly branched N-glycans suggests that after the action of GnTII (generation of
GlcNAc2Man3GlcNAc2 structures) GnTIV transfers GlcNAc from UDP-GlcNAc
in beta1,4 linkage to the Man alpha1,3 Man beta1,4 arm of GlcNAc2Man3GlcNAc2
N-glycans (Allen SD et al. (1984) J Biol Chem. Jun 10;259(11):6984-90; and
Gleeson PA and Schachter H. (1983) J Biol Chem 25;258(10):6162-73) resulting
in a triantennary agalacto sugar chain. This N-glycan (GlcNAc beta1-2 Man
alpha1-6(GlcNAc beta1-2 Man alpha1-3) Man beta1-4 GlcNAc beta1-4 GlcNAc
beta1,4 Asn) is a common substrate for GnT-III and -V, leading to the synthesis
of bisected, tri- and tetra-antennary structures. The action of GnTIII results
in a bisected N-glycan, and GnTV catalyzes the addition of beta1-6 GlcNAc
to the alpha1-6 mannosyl core, creating the beta1-6 branch. Addition of galactose
and sialic acid to these branches leads to the generation of a fully sialylated
complex N-glycan.

[0170] Branched complex N-glycans have been implicated in the physiological
activity of therapeutic proteins, such as human erythropoietin (hEPO). Human
EPO having bi-antennary structures has been shown to have a low activity,
whereas hEPO having tetra-antennary structures resulted in slower clearance from
the bloodstream and thus in higher activity (Misaizu T et al. (1995) Blood Dec
1;86(11):4097-104).

[0171] With DNA sequence information, the skilled worker can clone DNA
molecules encoding GnT IV and/or V activities (Example 6; Figs. 33 and 34).
Using standard techniques well-known to those of skill in the art, nucleic acid
molecules encoding GnT IV or V (or encoding catalytically active fragments
thereof) may be inserted into appropriate expression vectors under the
transcriptional control of promoters and other expression control sequences
capable of driving transcription in a selected host cell of the invention, e.g., a
fungal host such as Pichia sp., Kluyveromyces sp. and Aspergillus sp., as described
herein, such that one or more of these mammalian GnT enzymes may be actively
expressed in a host cell of choice for production of a human-like complex
glycoprotein.

[0172] The following are examples which illustrate the compositions and
methods of this invention. These examples should not be construed as limiting:
the examples are included for the purposes of illustration only.

25 EXAMPLE 1 

Identification, cloning and deletion of the ALG3 gene m P.pastoris and Klactis. 
[0173] Degenerate primers were generated based on an alignment of Alg3
protein sequences from S. cerevisiae, H. sapiens, and D. melanogaster and were
used to amplify an 83 bp product from P. pastoris genomic DNA:

5'-GGTGTTTTGTTTTCTAGATCTTTGCAYTAYCARTT-3' and
5'-AGAATTTGGTGGGTAAGAATTCCARCACCAYTCRTG-3'

The resulting PCR product was cloned into the pCR2.1 vector (Invitrogen,
Carlsbad, CA) and sequence analysis revealed homology to known
ALG3/RHK1/NOT56 homologs (Genbank NC_001134.2, AF309689, NC_003424.1).
Subsequently, 1929 bp upstream and 2738 bp downstream of the initial PCR
product were amplified from a P. pastoris genomic DNA library (Boehm, T. Yeast
1999 May;15(7):563-72) using the internal oligonucleotides

5'-CCTAAGCTGGTATGCGTTCTCTTTGCCATATC-3' and
5'-GCGGCATAAACAATAATAGATGCTATAAAG-3'

along with T3 (5'-AATTAACGCTCACTAAAGGG-3') and T7
(5'-GTAATACGACTCACTATAGGGC-3') (Integrated DNA Technologies, Coralville, IA)
in the backbone of the library bearing plasmid lambda ZAP II (Stratagene, La
Jolla, CA). The resulting fragments were cloned into the pCR2.1-TOPO vector
(Invitrogen) and sequenced. From this sequence, a 1395 bp ORF was identified
that encodes a protein with 35% identity and 53% similarity to the S. cerevisiae
ALG3 gene (using BLAST programs). The gene was named PpALG3.
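
The degenerate positions in the primers above (Y for C or T, R for A or G) mean that each primer is a pool of related sequences. The following is a minimal sketch, not part of the original method, of how such a pool can be counted and enumerated. The function names are illustrative, and only the IUPAC codes that occur in the forward primer are included.

```python
from itertools import product

# IUPAC nucleotide codes needed for the forward primer (subset only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence in the degenerate pool."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

# Forward degenerate primer as given in paragraph [0173].
FORWARD = "GGTGTTTTGTTTTCTAGATCTTTGCAYTAYCARTT"

pool = expand(FORWARD)
print(degeneracy(FORWARD), len(pool))
```

With three two-fold degenerate positions, the forward primer corresponds to a pool of eight sequences.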

[0174] The sequence of PpALG3 was used to create a set of primers to generate a
deletion construct of the PpALG3 gene by PCR overlap (Davidson et al, 2002
Microbiol. 148(Pt 8):2607-15). Primers below were used to amplify 1 kb regions
5' and 3' of the PpALG3 ORF and the KAN^R gene, respectively:

RCD142 (5'-CCACATCATCCGTGCTACATATAG-3'),
RCD144 (5'-ACGAGGCAAGCTAAACAGATCTCGAAGTATCGAGGGTTATCCAG-3'),
RCD145 (5'-CCATCCAGTGTCGAAAACGAGCCAATGGTTCATGTCTATAAATC-3'),
RCD147 (5'-AGCCTCAGCGCCAACAAGCGATGG-3'),
RCD143 (5'-CTGGATAACCCTCGATACTTCGAGATCTGTTTAGCTTGCCTCGT-3'), and
RCD146 (5'-GATTTATAGACATGAACCATTGGCTCGTTTTCGACACTGGATGG-3').

Subsequently, primers RCD142 and RCD147 were used to overlap the three
resulting PCR products into a single 3.6 kb alg3::KAN^R deletion allele.
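
In a PCR overlap scheme of this kind, the primers at each junction are expected to be reverse complements of one another so that adjacent products can anneal. The sketch below is an illustration rather than part of the original protocol. It checks that property for the two junction pairs, using the primer sequences as transcribed from the list above; the transcription itself should be treated as an assumption.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Junction primers as transcribed from paragraph [0174].
RCD143 = "CTGGATAACCCTCGATACTTCGAGATCTGTTTAGCTTGCCTCGT"
RCD144 = "ACGAGGCAAGCTAAACAGATCTCGAAGTATCGAGGGTTATCCAG"
RCD145 = "CCATCCAGTGTCGAAAACGAGCCAATGGTTCATGTCTATAAATC"
RCD146 = "GATTTATAGACATGAACCATTGGCTCGTTTTCGACACTGGATGG"

pairs = {
    "RCD143/RCD144": (RCD143, RCD144),
    "RCD145/RCD146": (RCD145, RCD146),
}

for name, (a, b) in pairs.items():
    # True when the two primers anneal to each other over their full length.
    print(name, reverse_complement(a) == b)
```

Both pairs are fully complementary over their 44 nucleotides, which is consistent with their use as junction primers joining the flanking regions to the marker.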



Identification, cloning and deletion of the ALG3 gene in K. lactis.
[0175] The ALG3p sequences from S. cerevisiae, Drosophila melanogaster,
Homo sapiens etc. were aligned with K. lactis sequences (PENDANT EST
database). Regions of high homology that were in common homologs but distinct
in exact sequence from the homologs were used to create pairs of degenerate
primers that were directed against genomic DNA from the K. lactis strain MG1/2
(Bianchi et al, 1987). In the case of ALG3, PCR amplification with primers KAL-1
(5'-ATCCnTACCGATGCTGTAT-3') and KAL-2
(5'-ATAACAGTATGTGTTACACGCGTGTAG-3') resulted in a product that was
cloned and sequenced and the predicted translation was shown to have a high
degree of homology to Alg3p proteins (>50% to S. cerevisiae Alg3p).

[0176] The PCR product was used to probe a Southern blot of genomic DNA
from K. lactis strain (MG1/2) with high stringency (Sambrook et al. 1989).
Hybridization was observed in a pattern consistent with a single gene. This
Southern blot was used to map the genomic loci. Genomic fragments were cloned
by digesting genomic DNA and ligating those fragments in the appropriate size-
range into pUC19 to create a K. lactis subgenomic library. This subgenomic
library was transformed into E. coli and several hundred clones were tested by
colony PCR, using primers KAL-1 and KAL-2. The clones containing the
predicted KlALG3 and KlALG61 genes were sequenced and open reading frames
identified.

[0177] Primers for construction of an alg3::NAT^R deletion allele, using a PCR
overlap method (Davidson et al, 2002), were designed and the resulting deletion
allele was transformed into two K. lactis strains and NAT-resistant colonies
selected. These colonies were screened by PCR and transformants were obtained
in which the ALG3 ORF was replaced with the och1::NAT^R mutant allele.

EXAMPLE 2 

Generation of an alg3/och1 mutant strain expressing an α-1,2-Mannosidase,
GnTI and GnTII for production of a human-like glycoprotein.

[0178] The 1215 bp open reading frame of the P. pastoris OCH1 gene as well as
2685 bp upstream and 1175 bp downstream was amplified by PCR (B. K. Choi et
al., submitted to Proc. Natl. Acad. Sci. USA 2002; see also WO 02/00879; each of
which is incorporated herein by reference), cloned into the pCR2.1-TOPO vector
(Invitrogen) and designated pBK9. To create an och1 knockout strain containing
multiple auxotrophic markers, 100 µg of pJN329, a plasmid containing an
och1::URA3 mutant allele flanked with SfiI restriction sites, was digested with SfiI
and used to transform P. pastoris strain JC308 (Cereghino et al. Gene 263 (2001)
159-169) by electroporation. Following incubation on defined medium lacking
uracil for 10 days at room temperature, 1000 colonies were picked and re-streaked.
URA+ clones that were unable to grow at 37°C, but grew at room temperature,
were subjected to colony PCR to test for the correct integration of the och1::URA3
mutant allele. One clone that exhibited the expected PCR pattern was designated
YJN153. The Kringle 3 domain of human plasminogen (K3) was used as a model
protein. A Neo^R marked plasmid containing the K3 gene was transformed into
strain YJN153 and a resulting strain, expressing K3, was named BK64-1 (B. K.
Choi et al, submitted to Proc. Natl. Acad. Sci. USA 2002).
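
The fragment sizes quoted for the OCH1 locus can be checked with simple arithmetic. The sketch below is an illustration only. It assumes that the 1215 bp open reading frame includes its stop codon, which the text does not state; the function names are hypothetical.

```python
# Fragment sizes reported for the P. pastoris OCH1 locus, in base pairs.
UPSTREAM_BP = 2685
ORF_BP = 1215
DOWNSTREAM_BP = 1175

def amplicon_length(*segments_bp: int) -> int:
    """Total length of a contiguous amplified region."""
    return sum(segments_bp)

def encoded_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Amino acids encoded by an ORF (assumption: the stop codon is counted in orf_bp)."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

total_bp = amplicon_length(UPSTREAM_BP, ORF_BP, DOWNSTREAM_BP)
residues = encoded_residues(ORF_BP)
print(total_bp, residues)
```

Under that assumption the amplified region is 5075 bp and the ORF encodes 404 amino acids.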
[0179] Plasmid pPB103, containing the Kluyveromyces lactis MNN2-2 gene,
encoding a Golgi UDP-N-acetylglucosamine transporter, was constructed by
cloning a blunt BglII-HindIII fragment from vector pDL02 (Abeijon et al. (1996)
Proc. Natl. Acad. Sci. U.S.A. 93:5963-5968) into BglII and BamHI digested and
blunt ended pBLADE-SX containing the P. pastoris ADE1 gene (Cereghino et al.
(2001) Gene 263:159-169). This plasmid was linearized with EcoNI and
transformed into strain BK64-1 by electroporation and one strain confirmed to
contain the MNN2-2 by PCR analysis was named PBP1.

[0180] A library of mannosidase constructs was generated, comprising in-frame
fusions of the leader domains of several type I or type II membrane proteins from
S. cerevisiae and P. pastoris fused with the catalytic domains of several α-1,2-
mannosidase genes from human, mouse, fly, worm and yeast sources (see, e.g.,
WO02/00879, incorporated herein by reference). This library was created in a P.
pastoris HIS4 integration vector and screened by linearizing with SalI,
transforming by electroporation into strain PBP1, and analyzing the glycans
released from the K3 reporter protein. One active construct chosen was a chimera
of the 988-1296 nucleotides (C-terminus) of the yeast SEC12 gene fused with a
N-terminal deletion of the mouse α-1,2-mannosidase IA (MmMannIA) gene, which
was missing the 187 nucleotides. A P. pastoris strain expressing this construct was
named PBP2.

[0181] A library of GnTI constructs was generated, comprising in-frame fusions
of the same leader library with the catalytic domains of GnTI genes from human,
worm, frog and fly sources (WO 02/00879). This library was created in a P.
pastoris ARG4 integration vector and screened by linearizing with AatII,
transforming by electroporation into strain PBP2, and analyzing the glycans
released from K3. One active construct chosen was a chimera of the first 120 bp of
the S. cerevisiae MNN9 gene fused to a deletion of the human GnTI gene, which
was missing the first 154 bp. A P. pastoris strain expressing this construct was
named PBP3.
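
The fusion libraries described above are combinatorial: each leader fragment is paired with each catalytic domain. The following sketch, offered as an illustration under stated assumptions, enumerates such a library and checks that the two leader fragments named in the text span a whole number of codons, which is a necessary (though not sufficient) condition for an in-frame fusion. The coordinates are taken from the text; treating them as 1-based inclusive positions, and pairing both leaders with the five mannosidase sources, are assumptions made for the example.

```python
from itertools import product

# Leader fragments named in the text, as (first nt, last nt).
# Assumption: coordinates are 1-based and inclusive.
LEADERS = {
    "SEC12(988-1296)": (988, 1296),
    "MNN9(1-120)": (1, 120),
}

# Source organisms of the mannosidase catalytic domains listed in [0180].
CATALYTIC_SOURCES = ["human", "mouse", "fly", "worm", "yeast"]

def fragment_length(first: int, last: int) -> int:
    """Length in nucleotides of an inclusive coordinate range."""
    return last - first + 1

def is_whole_codons(first: int, last: int) -> bool:
    """True when the fragment length is a multiple of three."""
    return fragment_length(first, last) % 3 == 0

# Every leader paired with every catalytic-domain source.
library = [f"{leader}::{source}"
           for leader, source in product(LEADERS, CATALYTIC_SOURCES)]

for name, (first, last) in LEADERS.items():
    print(name, fragment_length(first, last), is_whole_codons(first, last))
print(len(library), "constructs")
```

Two leaders and five catalytic-domain sources give ten candidate constructs in this toy case; the actual libraries in the cited application may be larger.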

[0182] Subsequently, a P. pastoris alg3::KAN^R deletion construct was generated
as described above. Approximately 5 µg of the resulting PCR product was
transformed into strain PBP3 and colonies were selected on YPD medium
containing 200 µg/ml G418. One strain out of 20 screened by PCR was confirmed
to contain the correct integration of the alg3::KAN^R mutant allele and lack the
wild-type allele. This strain was named RDP27.
[0183] Finally, a library of GnTII constructs was generated, which was
comprised of in-frame fusions of the leader library with the catalytic domains of
GnTII genes from human and rat sources (WO 02/00879). This library was
created in a P. pastoris integration vector containing the NST^R gene conferring
resistance to the drug nourseothricin. The library plasmids were linearized with
EcoRI, transformed into strain RDP27 by electroporation, and the resulting strains
were screened by analysis of the released glycans from purified K3.






Materials 

[0184] MOPS, sodium cacodylate, manganese chloride, UDP-galactose and
CMP-N-acetylneuraminic acid were from Sigma. TFA was from Aldrich.
Recombinant rat α2,6-sialyltransferase from Spodoptera frugiperda and β1,4-
galactosyltransferase from bovine milk were from Calbiochem. Protein N-
glycosidase F, mannosidases, and oligosaccharides were from Glyko (San Rafael,
CA). DEAE ToyoPearl resin was from TosoHaas. Metal chelating "HisBind"
resin was from Novagen (Madison, WI). 96-well lysate-clearing plates were from
Promega (Madison, WI). Protein-binding 96-well plates were from Millipore
(Bedford, MA). Salts and buffering agents were from Sigma (St. Louis, MO).
MALDI matrices were from Aldrich (Milwaukee, WI).


Protein Purification 

[0185] Kringle 3 was purified using a 96-well format on a Beckman BioMek
2000 sample-handling robot (Beckman/Coulter, Rancho Cucamonga, CA). Kringle
3 was purified from expression media using a C-terminal hexa-histidine tag. The
robotic purification is an adaptation of the protocol provided by Novagen for their
HisBind resin. Briefly, a 150uL (µL) settled volume of resin is poured into the
wells of a 96-well lysate-binding plate, washed with 3 volumes of water and
charged with 5 volumes of 50mM NiSO4 and washed with 3 volumes of binding
buffer (5mM imidazole, 0.5M NaCl, 20mM Tris-HCl pH7.9). The protein
expression media is diluted 3:2, media/PBS (60mM PO4, 16mM KCl, 822mM
NaCl pH7.4) and loaded onto the columns. After draining, the columns are
washed with 10 volumes of binding buffer and 6 volumes of wash buffer (30mM
imidazole, 0.5M NaCl, 20mM Tris-HCl pH7.9) and the protein is eluted with 6
volumes of elution buffer (1M imidazole, 0.5M NaCl, 20mM Tris-HCl pH7.9).
The eluted glycoproteins are evaporated to dryness by lyophilization.
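
Because each step of the purification is specified in resin volumes, the liquid required per well and per plate follows directly from the settled resin volume. The sketch below is an illustration, not the Novagen protocol itself. It assumes a settled resin volume of 150 µL per well and that all 96 wells are used; the step labels are descriptive names chosen for the example.

```python
# Assumption: settled resin volume per well, in microlitres.
RESIN_UL = 150

# (step label, number of resin volumes), in the order given in the text.
STEPS = [
    ("water wash", 3),
    ("50 mM NiSO4 charge", 5),
    ("binding buffer equilibration", 3),
    ("binding buffer wash", 10),
    ("wash buffer", 6),
    ("elution buffer", 6),
]

def step_volumes_ul(resin_ul: float, wells: int = 1) -> dict[str, float]:
    """Volume of liquid needed for each step, in microlitres."""
    return {name: volumes * resin_ul * wells for name, volumes in STEPS}

per_well_ul = step_volumes_ul(RESIN_UL)
per_plate_ml = {name: ul / 1000
                for name, ul in step_volumes_ul(RESIN_UL, wells=96).items()}

for name in per_well_ul:
    print(f"{name}: {per_well_ul[name]} uL/well, {per_plate_ml[name]} mL/plate")
```

For example, elution with 6 resin volumes corresponds to 900 µL per well, or 86.4 mL of elution buffer for a full plate, before any allowance for dead volume.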

Release of N-linked Glycans

[0186] The glycans are released and separated from the glycoproteins by a
modification of a previously reported method (Papac, et al. A. J. S. (1998)
Glycobiology 8, 445-454). The wells of a 96-well MultiScreen IP (Immobilon-P
membrane) plate (Millipore) are wetted with 100uL of methanol, washed with
3X150uL of water and 50uL of RCM buffer (8M urea, 360mM Tris, 3.2mM
EDTA pH8.6), draining with gentle vacuum after each addition. The dried protein
samples are dissolved in 30uL of RCM buffer and transferred to the wells
containing 10uL of RCM buffer. The wells are drained and washed twice with
RCM buffer. The proteins are reduced by addition of 60uL of 0.1M DTT in RCM
buffer for 1hr at 37°C. The wells are washed three times with 300uL of water and
carboxymethylated by addition of 60uL of 0.1M iodoacetic acid for 30min in the
dark at room temperature. The wells are again washed three times with water and
the membranes blocked by the addition of 100uL of 1% PVP 360 in water for 1hr
at room temperature. The wells are drained and washed three times with 300uL of
water and deglycosylated by the addition of 30uL of 10mM NH4HCO3 pH 8.3
containing one milliunit of N-glycanase (Glyko). After 16 hours at 37°C, the
solution containing the glycans is removed by centrifugation and evaporated to
dryness.

Matrix Assisted Laser Desorption Ionization Time of Flight Mass
Spectrometry

[0187] Molecular weights of the glycans were determined using a Voyager DE
PRO linear MALDI-TOF (Applied Biosciences) mass spectrometer using delayed
extraction. The dried glycans from each well were dissolved in 15uL of water and
0.5uL spotted on stainless steel sample plates and mixed with 0.5uL of S-DHB
matrix (9mg/mL of dihydroxybenzoic acid, 1mg/mL of 5-methoxysalicylic acid in
1:1 water/acetonitrile 0.1% TFA) and allowed to dry.

[0188] Ions were generated by irradiation with a pulsed nitrogen laser (337nm)
with a 4ns pulse time. The instrument was operated in the delayed extraction mode
with a 125ns delay and an accelerating voltage of 20kV. The grid voltage was
93.00%, guide wire voltage was 0.10%, the internal pressure was less than 5 X
10^-7 torr, and the low mass gate was 875Da. Spectra were generated from the sum
of 100-200 laser pulses and acquired with a 2 GHz digitizer. Man5 oligosaccharide
was used as an external molecular weight standard. All spectra were generated
with the instrument in the positive ion mode. The estimated mass accuracy of the
spectra was 0.5%.
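
A relative mass accuracy translates into an absolute uncertainty that grows with the observed mass. The sketch below is a minimal illustration. It assumes that the stated 0.5% is a symmetric relative error, and it evaluates the resulting window at the 875 Da low mass gate; neither the function nor this interpretation comes from the instrument documentation.

```python
# Assumption: the estimated mass accuracy is a symmetric relative error.
MASS_ACCURACY = 0.005        # 0.5 %
LOW_MASS_GATE_DA = 875.0     # low mass gate quoted in the text

def mass_window(observed_da: float,
                accuracy: float = MASS_ACCURACY) -> tuple[float, float]:
    """Lower and upper bounds on the true mass for an observed mass."""
    delta = observed_da * accuracy
    return observed_da - delta, observed_da + delta

low, high = mass_window(LOW_MASS_GATE_DA)
print(f"{LOW_MASS_GATE_DA} Da -> {low:.3f} to {high:.3f} Da")
```

At the low mass gate the window is roughly plus or minus 4.4 Da, which is small compared with the mass of a single monosaccharide residue.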



Materials: 

[0189] MOPS, sodium cacodylate, manganese chloride, UDP-galactose and
CMP-N-acetylneuraminic acid were from Sigma, Saint Louis, MO. Trifluoroacetic
acid (TFA) was from Sigma/Aldrich, Saint Louis, MO. Recombinant rat alpha-2,6-
sialyltransferase from Spodoptera frugiperda and beta-1,4-galactosyltransferase
from bovine milk were from Calbiochem, San Diego, CA.

β-N-acetylhexosaminidase Digestion
[0190] The glycans were released and separated from the glycoproteins by a
modification of a previously reported method (Papac, et al. A. J. S. (1998)
Glycobiology 8, 445-454). After the proteins were reduced and carboxymethylated,
and the membranes blocked, the wells were washed three times with water. The
protein was deglycosylated by the addition of 30 µl of 10 mM NH4HCO3 pH 8.3
containing one milliunit of N-glycanase (Glyko, Novato, CA). After 16 hr at 37°C,
the solution containing the glycans was removed by centrifugation and evaporated
to dryness. The glycans were then dried in SC210A speed vac (Thermo Savant,
Holbrook, NY). The dried glycans were put in 50 mM NH4Ac pH 5.0 at 37°C
overnight and 1mU of hexos (Glyko, Novato, CA) was added.


Galactosyltransferase Reaction 

[0191] Approximately 2mg of protein (r-K3:hPg [PBP6-5]) was purified by
nickel-affinity chromatography, extensively dialyzed against 0.1% TFA, and
lyophilized to dryness. The protein was redissolved in 150µL of 50mM MOPS,
20mM MnCl2, pH7.4. After addition of 32.5µg (533nmol) of UDP-galactose and
4mU of β1,4-galactosyltransferase, the sample was incubated at 37°C for 18
hours. The samples were then dialyzed against 0.1% TFA for analysis by MALDI-
TOF mass spectrometry.

[0192] The spectrum of the protein reacted with galactosyltransferase showed an
increase in mass consistent with the addition of two galactose moieties when
compared with the spectrum of a similar protein sample incubated without enzyme.
Protein samples were next reduced, carboxymethylated and deglycosylated with
PNGase F. The recovered N-glycans were analyzed by MALDI-TOF mass
spectrometry. The mass of the predominant glycan from the galactosyltransferase
reacted protein was greater than that of the control glycan by a mass consistent
with the addition of two galactose moieties (325.4 Da).

Sialyltransferase Reaction

[0193] After resuspending the (galactosyltransferase reacted) proteins in 10µL of
50mM sodium cacodylate buffer pH6.0, 300µg (488nmol) of CMP-N-
acetylneuraminic acid (CMP-NANA) dissolved in 15µL of the same buffer, and
5µL (2mU) of recombinant α-2,6 sialyltransferase were added. After incubation at
37°C for 15 hours, an additional 200µg of CMP-NANA and 1mU of
sialyltransferase were added. The protein samples were incubated for an additional
8 hours and then dialyzed and analyzed by MALDI-TOF-MS as above.
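
The molar amount of donor substrate quoted above can be reproduced from its mass. The sketch below is an illustration only. It assumes the free-acid molecular weight of CMP-N-acetylneuraminic acid (about 614.45 g/mol for C20H31N4O16P); a salt form would give a different value, and the text does not say which form was weighed.

```python
# Assumption: free-acid molecular weight of CMP-NANA (C20H31N4O16P), g/mol.
CMP_NANA_MW = 614.45

def micrograms_to_nanomoles(mass_ug: float, mw_g_per_mol: float) -> float:
    """Convert a mass in micrograms to an amount in nanomoles."""
    return mass_ug / mw_g_per_mol * 1000.0

first_addition_nmol = micrograms_to_nanomoles(300.0, CMP_NANA_MW)
print(round(first_addition_nmol))
```

With this molecular weight, 300 µg corresponds to about 488 nmol, in agreement with the figure given for the first addition.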
[0194] The spectrum of the glycoprotein reacted with sialyltransferase showed an
increase in mass when compared with that of the starting material (the protein after
galactosyltransferase reaction). The N-glycans were released and analyzed as
above. The increase in mass of the two ion-adducts of the predominant glycan was
consistent with the addition of two sialic acid residues (580 and 583Da).
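
The mass increases reported for the galactosyltransferase and sialyltransferase reactions can be compared with the expected mass of two added residues. The sketch below is a rough check, not the authors' analysis. It assumes average atomic masses and that each added sugar contributes its residue mass (the monosaccharide minus one water), which is the usual convention for glycosidic linkages.

```python
# Average atomic masses (standard atomic weights, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Elemental composition of each residue (monosaccharide minus H2O).
RESIDUES = {
    "galactose": {"C": 6, "H": 10, "O": 5},
    "N-acetylneuraminic acid": {"C": 11, "H": 17, "N": 1, "O": 8},
}

def residue_mass(formula: dict[str, int]) -> float:
    """Average mass of a residue from its elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

def expected_shift(residue: str, count: int) -> float:
    """Expected mass increase after adding `count` copies of a residue."""
    return count * residue_mass(RESIDUES[residue])

# Mass increases reported in paragraphs [0192] and [0194], in Da.
REPORTED = {
    "galactose": [325.4],
    "N-acetylneuraminic acid": [580.0, 583.0],
}

for residue, observed_values in REPORTED.items():
    expected = expected_shift(residue, 2)
    for observed in observed_values:
        print(f"{residue}: expected {expected:.2f} Da, "
              f"reported {observed} Da, difference {observed - expected:+.2f} Da")
```

Two galactose residues correspond to about 324.3 Da and two sialic acid residues to about 582.5 Da, so the reported increases differ from the expected values by only a few daltons.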



EXAMPLE 3

Identification, cloning and deletion of the
ALG9 and ALG12 genes in P. pastoris

[0195] Similar to Example 1, the ALG9p and ALG12 sequences, respectively,
from S. cerevisiae, Drosophila melanogaster, Homo sapiens, etc., are aligned and
regions of high homology are used to design degenerate primers. These primers
are employed in a PCR reaction on genomic DNA from P. pastoris. The
resulting initial PCR product is subcloned, sequenced and used to probe a Southern
blot of genomic DNA from P. pastoris with high stringency (Sambrook et al.,
1989). Hybridization is observed. This Southern blot is used to map the genomic
loci. Genomic fragments are cloned by digesting genomic DNA and ligating those
fragments in the appropriate size-range into pUC19 to create a P. pastoris
subgenomic library. This subgenomic library is transformed into E. coli and
several hundred clones are tested by colony PCR, using primers designed based on
the sequence of the initial PCR product. The clones containing the predicted genes
are sequenced and open reading frames identified. Primers for construction of an
alg9::NAT^R deletion allele, using a PCR overlap method (Davidson et al., 2002),
are designed. The resulting deletion allele is transformed into two P. pastoris
strains and NAT-resistant colonies are selected. These colonies are screened by
PCR and transformants obtained in which the ALG9 ORF is replaced with the
och1::NAT^R mutant allele. See generally, Cipollo et al. Glycobiology 2002
(12)11:749-762; Chantret et al. J. Biol. Chem. Jul. 12, 2002 (277)28:25815-25822;
Cipollo et al. J. Biol. Chem. Feb. 11, 2000 (275)6:4267-4277; Burda et al. Proc.
Natl. Acad. Sci. U.S.A. July 1996 (93):7160-7165; Karaoglu et al. Biochemistry
2001, 40, 12193-12206; Grimme et al. J. Biol. Chem. July 20, 2001
(276)29:27731-27739; Verostek et al. J. Biol. Chem. June 5, 1993 (268)16:12095-
12103; Huffaker et al. Proc. Natl. Acad. Sci. U.S.A. Dec. 1983 (80):7466-7470.


EXAMPLE 4 

Identification, cloning and expression of Alpha 1,2-3 Mannosidase From 

Xanthomonas Manihotis 


[0196] The alpha 1,2-3 Mannosidase from Xanthomonas Manihotis has two
activities: an alpha-1,2 and an alpha-1,3 mannosidase. The methods of the
invention may also use two independent mannosidases having these activities,
which may be similarly identified and cloned from a selected organism of interest.

[0197] As described by Landry et al., alpha-mannosidases can be purified from
Xanthomonas sp., such as Xanthomonas manihotis. X. manihotis can be purchased
from the American Type Culture Collection (ATCC catalog number 49764)
(Xanthomonas axonopodis Starr and Garces pathovar manihotis deposited as
Xanthomonas manihotis (Arthaud-Berthet) Starr). Enzymes are purified from
crude cell-extracts as previously described (Wong-Madden, S.T. and Landry, D.
(1995) Purification and characterization of novel glycosidases from the bacterial
genus Xanthomonas; and Landry, D. US Patent US 6,300,113 B1 Isolation and
composition of novel Glycosidases). After purification of the mannosidase, one of
several methods is used to obtain peptide sequence tags (see, e.g., W. Quadroni
M et al. (2000). A method for the chemical generation of N-terminal peptide
sequence tags for rapid protein identification. Anal Chem (2000) Mar
1;72(5):1006-14; Wilkins MR et al. Rapid protein identification using N-terminal
"sequence tag" and amino acid analysis. Biochem Biophys Res Commun. (1996)
Apr 25;221(3):609-13; and Tsugita A. (1987) Developments in protein
microsequencing. Adv Biophys (1987) 23:81-113).

[0198] Sequence tags generated using a method above are then used to generate
sets of degenerate primers using methods well-known to the skilled worker.
Degenerate primers are used to prime DNA amplification in polymerase chain
reactions (e.g., using Taq polymerase kits according to manufacturers'
instructions) to amplify DNA fragments. The amplified DNA fragments are used
as probes to isolate DNA molecules comprising the gene encoding a desired
mannosidase, e.g., using standard Southern DNA hybridization techniques to
identify and isolate (clone) genomic pieces encoding the enzyme of interest. The
genomic DNA molecules are sequenced and putative open reading frames and
coding sequences are identified. A suitable expression construct encoding for the
glycosidase of interest can then be generated using methods described herein and
well-known in the art.

[0199] Nucleic acid fragments comprising sequences encoding alpha 1,2-3
mannosidase activity (or catalytically active fragments thereof) are cloned into
appropriate expression vectors for expression, and preferably targeted expression,
of these activities in an appropriate host cell according to the methods set forth
herein.






EXAMPLE 5

Identification, cloning and expression of the ALG6 gene in P. pastoris

[0200] Similar to Example 1, the ALG6p sequences from S. cerevisiae,
Drosophila melanogaster, Homo sapiens etc. are aligned and regions of high
homology are used to design degenerate primers. These primers are employed in a
PCR reaction on genomic DNA from P. pastoris. The resulting initial PCR
product is subcloned, sequenced and used to probe a Southern blot of genomic
DNA from P. pastoris with high stringency (Sambrook et al, 1989). Hybridization
is observed. This Southern blot is used to map the genomic loci. Genomic
fragments are cloned by digesting genomic DNA and ligating those fragments in
the appropriate size-range into pUC19 to create a P. pastoris subgenomic library.
This subgenomic library is transformed into E. coli and several hundred clones are
tested by colony PCR, using primers designed based on the sequence of the initial
PCR product. The clones containing the predicted genes are sequenced and open
reading frames identified. Primers for construction of an alg6::NAT^R deletion
allele, using a PCR overlap method (Davidson et al, 2002), are designed and the
resulting deletion allele is transformed into two P. pastoris strains and NAT-
resistant colonies selected. These colonies are screened by PCR and transformants
are obtained in which the ALG6 ORF is replaced with the och1::NAT^R mutant
allele. See, e.g., Imbach et al. Proc. Natl. Acad. Sci. U.S.A. June 1999 (96)6982-
6987.

[0201] Nucleic acid fragments comprising sequences encoding Alg6p (or
catalytically active fragments thereof) are cloned into appropriate expression
vectors for expression, and preferably targeted expression, of these activities in an
appropriate host cell according to the methods set forth herein. The cloned ALG6
gene can be brought under the control of any suitable promoter to achieve
overexpression. Even expression of the gene under the control of its own promoter
is possible. Expression from multicopy plasmids will generate high levels of
expression ("overexpression").



EXAMPLE 6 

Cloning and Expression Of GnT III To Produce

Bisecting GlcNAcs Which Boost Antibody Functionality

A. Background

[0202] The addition of an N-acetylglucosamine to the GlcNAc2Man3GlcNAc2
structure by N-acetylglucosaminyltransferase III yields a so-called bisected N-
glycan (see Figure 3). This structure has been implicated in greater antibody-
dependent cellular cytotoxicity (ADCC) (Umana et al. 1999).
[0203] A host cell such as a yeast strain capable of producing glycoproteins with
bisected N-glycans is engineered according to the invention, by introducing into
the host cell a GnTIII activity. Preferably, the host cell is transformed with a
nucleic acid that encodes GnTIII (e.g., a mammalian such as the murine GnT III
shown in Fig. 32) or a domain thereof having enzymatic activity, optionally fused
to a heterologous cell signal targeting peptide (e.g., using the libraries and
associated methods of the invention.)

[0204] IgGs consist of two heavy chains (VH, CH1, CH2 and CH3 in Figure 30),
interconnected in the hinge region through three disulfide bridges, and two light
chains (VL, CL in Figure 30). The light chains (domains VL and CL) are linked by
another disulfide bridge to the CH1 portion of the heavy chain and together with the
CH1 and VH fragment make up the so-called Fab region. Antigens bind to the
terminal portion of the Fab region. The Fc region of IgGs consists of the CH3, the
CH2 and the hinge region and is responsible for the exertion of so-called effector
functions (see below).

[0205] The primary function of antibodies is binding to an antigen. However,
unless binding to the antigen directly inactivates the antigen (such as in the case of
bacterial toxins), mere binding is meaningless unless so-called effector-functions
are triggered. Antibodies of the IgG subclass exert two major effector-functions:
the activation of the complement system and induction of phagocytosis. The
complement system consists of a complex group of serum proteins involved in
controlling inflammatory events, in the activation of phagocytes and in the lytical
destruction of cell membranes. Complement activation starts with binding of the
C1 complex to the Fc portion of two IgGs in close proximity. C1 consists of one
molecule, C1q, and two molecules, C1r and C1s. Phagocytosis is initiated through
an interaction between the IgG's Fc fragment and Fc-gamma-receptors (FcγRI, II
and III in Figure 30). Fc receptors are primarily expressed on the surface of
effector cells of the immune system, in particular macrophages, monocytes,
myeloid cells and dendritic cells.

[0206] The CH2 portion harbors a conserved N-glycosylation site at asparagine
297 (Asn297). The Asn297 N-glycans are highly heterogeneous and are known to
affect Fc receptor binding and complement activation. Only a minority (i.e., about
15-20%) of IgGs bears a disialylated, and 3-10% have a monosialylated N-glycan
(reviewed in Jefferis, R., Glycosylation of human IgG Antibodies. BioPharm,
2001). Interestingly, the minimal N-glycan structure shown to be necessary for
fully functional antibodies capable of complement activation and Fc receptor
binding is a pentasaccharide with terminal N-acetylglucosamine residues
(GlcNAc2Man3) (reviewed in Jefferis, Glycosylation of human IgG Antibodies.
BioPharm, 2001). Antibodies with less than a GlcNAc2Man3 N-glycan or no N-
glycosylation at Asn297 might still be able to bind an antigen but most likely will
not activate the crucial downstream events such as phagocytosis and complement
activation. In addition, antibodies with fungal-type N-glycans attached to Asn297
will in all likelihood elicit an immune-response in a mammalian organism which
will render that antibody useless as a therapeutic glycoprotein.

B. Cloning And Expression Of GnTIII

The DNA fragment encoding part of the mouse GnTIII protein lacking the TM
domain is PCR amplified from murine (or other mammalian) genomic DNA using
forward 5'-TCCTGGCGCGCCTTCCCGAGAGAACTGGCCTCCCTC-3' and
5'-AATTAATTAACCCTAGCCCTCCGCTGTATCCAACTTG-3' reverse
primers. Those primers include AscI and PacI restriction sites that will be used for
cloning into the vector suitable for the fusion with leader library.

The nucleic acid and amino acid sequence of murine GnTIII is shown in Fig. 32.
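
The cloning sites built into the two primers can be located by a simple sequence search. The sketch below is an illustration rather than part of the described method. The recognition sequences used for AscI (GGCGCGCC) and PacI (TTAATTAA) are the standard ones, and both are palindromic, so searching one strand is sufficient; positions are reported as 0-based offsets from the 5' end of each primer.

```python
# Standard recognition sequences; both sites are palindromic.
SITES = {"AscI": "GGCGCGCC", "PacI": "TTAATTAA"}

# Primers as given in section B.
FORWARD = "TCCTGGCGCGCCTTCCCGAGAGAACTGGCCTCCCTC"
REVERSE = "AATTAATTAACCCTAGCCCTCCGCTGTATCCAACTTG"

def find_sites(primer: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in the primer to its first 0-based offset."""
    return {name: primer.find(site)
            for name, site in SITES.items() if site in primer}

print("forward:", find_sites(FORWARD))
print("reverse:", find_sites(REVERSE))
```

The AscI site sits near the 5' end of the forward primer and the PacI site near the 5' end of the reverse primer, so each amplified fragment carries one site at each end for directional cloning.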

C. Cloning of immunoglobulin-encoding sequences

[0207] Protocols for the cloning of the variable regions of antibodies, including
primer sequences, have been published previously. Sources of antibodies and
encoding genes can be, among others, in vitro immunized human B cells (see, e.g.,
Borreback, C. et al. (1988) Proc. Natl. Acad. Sci. USA 85, 3995-3999), peripheral
blood lymphocytes or single human B cells (see, e.g., Lagerkvist, A.C. et al.
(1995) Biotechniques 18, 862-869; and Terness, P. et al. (1997) Hum. Immunol. 56,
17-27) and transgenic mice containing human immunoglobulin loci, allowing the
creation of hybridoma cell-lines.

[0208] Using standard recombinant DNA techniques, antibody-encoding nucleic
acid sequences can be cloned. Sources for the genetic information encoding
immunoglobulins of interest are typically total RNA preparations from cells of
interest, such as blood lymphocytes or hybridoma cell lines. For example, by
employing a PCR based protocol with specific primers, variable regions can be
cloned via reverse transcription initiated from a sequence-specific primer
hybridizing to the IgG CH1 domain site and a second primer encoding amino acids
111-118 of the murine kappa constant region. The VH and Vκ encoding cDNAs
will then be amplified as previously published (see, e.g., Graziano, R.F. et al.
(1995) J Immunol 155(10): p. 4996-5002; Welschof, M. et al. (1995) J. Immunol.
Methods 179, 203-214; and Orlandi, R. et al. (1988) Proc. Natl. Acad. Sci. USA 86:
3833). Cloning procedures for whole immunoglobulins (heavy and light chains)
have also been published (see, e.g., Buckel, P. et al. (1987) Gene 51:13-19;
Recinos A 3rd et al. (1994) Gene 149: 385-386; (1995) Gene Jun 9;158(2):311-2;
and Recinos A 3rd et al. (1994) Gene Nov 18;149(2):385-6). Additional protocols
for the cloning and generation of antibody fragment and antibody expression
constructs have been described in Antibody Engineering, R. Kontermann and S.
Dübel (2001), Editors, Springer Verlag: Berlin Heidelberg New York.
[0209] Fungal expression plasmids encoding heavy and light chain of
immunoglobulins have been described (see, e.g., Abdel-Salam, H.A. et al. (2001)
Appl. Microbiol. Biotechnol. 56: 157-164; and Ogunjimi, A.A. et al. (1999)
Biotechnology Letters 21: 561-567). One can thus generate expression plasmids
harboring the constant regions of immunoglobulins. To facilitate the cloning of
variable regions into these expression vectors, suitable restriction sites can be
placed in close proximity to the termini of the variable regions. The constant
regions can be constructed in such a way that the variable regions can be easily in-
frame fused to them by a simple restriction-digest / ligation experiment. Figure 31
shows a schematic overview of such an expression construct, designed in a very
modular way, allowing easy exchange of promoters, transcriptional terminators,
integration targeting domains and even selection markers.
[0210] As shown in Figure 31, VL as well as VH domains of choice can be easily
cloned in-frame with CL and the CH regions, respectively. Initial integration is
targeted to the P. pastoris AOX locus (or homologous locus in another fungal cell)
and the methanol-inducible AOX promoter will drive expression. Alternatively,
any other desired constitutive or inducible promoter cassette may be used. Thus, if
desired, the 5'AOX and 3'AOX regions as well as transcriptional terminator (TT)
fragments can be easily replaced with different TT, promoter and integration
targeting domains to optimize expression. Initially the alpha-factor secretion
signal with the standard KEX protease site is employed to facilitate secretion of
heavy and light chains. The properties of the expression vector may be further
refined using standard techniques.

[0211] An Ig expression vector such as the one described above is introduced
into a host cell of the invention that expresses GnTIII, preferably in the Golgi
apparatus of the host cell. The Ig molecules expressed in such a host cell comprise
N-glycans having bisecting GlcNAcs.

EXAMPLE 7 

Cloning and expression of GnT-IV (UDP-GlcNAc:alpha-1,3-D-mannoside
beta-1,4-N-Acetylglucosaminyltransferase IV) and

GnT-V (beta 1-6-N-acetylglucosaminyltransferase)

[0212] GnTIV-encoding cDNAs were isolated from bovine and human cells
(Minowa, M.T. et al. (1998) J. Biol. Chem. 273 (19), 11556-11562; and
Yoshida, A. et al. (1999) Glycobiology 9 (3), 303-310). The DNA fragments
encoding full length and a part of the human GnT-IV protein (Figure 33) lacking
the TM domain are PCR amplified from the cDNA library using forward
5'-AATGAGATGAGGCTCCGCAATGGAACTG-3',
5'-CTGATTGCTTATCAACGAGAATTCCTTG-3', and reverse
5'-TGTTGGTTTCTCAGATGATCAGTTGGTG-3' primers, respectively.
The resulting PCR products are cloned and sequenced.

[0213] Similarly, genes encoding GnT-V protein have been isolated from several 
mammalian species, including mouse. (See, e.g., Alverez, K. et al. Glycobiology 
12 (7), 389-394 (2002)). The DNA fragments encoding full length and a part of 
the mouse GnT-V protein (Figure 34) lacking the TM domain are PCR amplified 
from the cDNA library using forward 
5'-AGAGAGAGATGGCTTTCTTTTCTCCCTGG-3', 
5'-AAATCAAGTGGATGAAGGACATGTGGC-3', and reverse 
5'-AGCGATGCTATAGGCAGTCTTTGCAGAG-3' primers, respectively. The 
resulting PCR products are cloned and sequenced. 
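
As a rough illustration of how primers such as those quoted in this Example might be sanity-checked before ordering, the sketch below computes the length, GC fraction and reverse complement of two of the human GnT-IV primers (the second forward primer and the reverse primer, which are legible in the text). The helper names are illustrative assumptions, and nothing here is part of the described method.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primers as quoted in the text for the human GnT-IV fragment lacking the TM domain.
FORWARD_TRUNCATED = "CTGATTGCTTATCAACGAGAATTCCTTG"
REVERSE = "TGTTGGTTTCTCAGATGATCAGTTGGTG"

for name, primer in [("forward", FORWARD_TRUNCATED), ("reverse", REVERSE)]:
    print(name, len(primer), round(gc_fraction(primer), 2), reverse_complement(primer))
```

The reverse complement of the reverse primer is the sequence expected on the coding strand at the 3' end of the amplified fragment, which makes it easy to compare against a cDNA sequence by eye.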

[0214] Nucleic acid fragments comprising sequences encoding GnT IV or V (or 
catalytically active fragments thereof) are cloned into appropriate expression 
vectors for expression, and preferably targeted expression, of these activities in an 
appropriate host cell according to the methods set forth herein. 

What is Claimed is: 

1. A method for producing a human-like glycoprotein in a non-human 
eukaryotic host cell comprising the step of diminishing or depleting the activity of 
one or more enzymes in the host cell that transfers a sugar residue to the 1,6 arm of 
a lipid-linked oligosaccharide structure. 

2. The method of claim 1, further comprising the step of introducing into the 
host cell at least one glycosidase activity. 

3. The method of claim 2, wherein at least one glycosidase activity is a 
mannosidase activity. 

4. The method of claim 1, further comprising producing an N-glycan. 

5. The method of claim 4, wherein the N-glycan has a GlcNAcManXGlcNAc2 
structure wherein X is 3, 4 or 5. 

6. The method of claim 5, further comprising the step of expressing within the 
host cell one or more enzyme activities, selected from glycosidase and 
glycosyltransferase activities, to produce a GlcNAc2Man3GlcNAc2 structure. 

7. The method of claim 6, wherein the activity is selected from α-1,2 
mannosidase, α-1,3 mannosidase and GnTII activities. 

8. The method of claim 1, wherein at least one diminished or depleted enzyme 
is selected from the group consisting of an enzyme having 
dolichyl-P-Man:Man5GlcNAc2-PP-dolichyl alpha-1,3 mannosyltransferase activity; an 
enzyme having dolichyl-P-Man:Man6GlcNAc2-PP-dolichyl alpha-1,2 
mannosyltransferase activity and an enzyme having 
dolichyl-P-Man:Man7GlcNAc2-PP-dolichyl alpha-1,6 mannosyltransferase activity. 

9. The method of claim 1, wherein the diminished or depleted enzyme has 
dolichyl-P-Man:Man5GlcNAc2-PP-dolichyl alpha-1,3 mannosyltransferase 
activity. 

10. The method of claim 1, wherein the enzyme is diminished or depleted by 
mutation of a host cell gene encoding the enzymatic activity. 

11. The method of claim 10, wherein the mutation is a partial or total deletion 
of a host cell gene encoding the enzymatic activity. 

12. The method of claim 1, wherein the glycoprotein comprises N-glycans 
having seven or fewer mannose residues. 

13. The method of claim 1, wherein the glycoprotein comprises N-glycans 
having three or fewer mannose residues. 

14. The method of claim 1, wherein the glycoprotein comprises one or more 
sugars selected from the group consisting of galactose, GlcNAc, sialic acid, and 
fucose. 

15. The method of claim 1, wherein the glycoprotein comprises at least one 
oligosaccharide branch comprising the structure NeuNAc-Gal-GlcNAc-Man. 

16. The method of claim 1, wherein the host is a lower eukaryotic cell. 

17. The method of claim 1, wherein the host cell is selected from the group 
consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia 
koclamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, 
Pichia salictaria, Pichia guercuum, Pichia pijperi, Pichia stiptis, Pichia 
methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula 
polymorpha, Kluyveromyces sp., Candida albicans, Aspergillus nidulans, 
Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium 
lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum and 
Neurospora crassa. 

18. The method of claim 1, wherein the host cell is further deficient in 
expression of initiating α-1,6 mannosyltransferase activity. 

19. The method of claim 18, wherein the host cell is an OCH1 mutant of P. 
pastoris. 

20. The method of claim 1, wherein the host cell expresses GnTI and 
UDP-GlcNAc transporter activities. 

21. The method of claim 1, wherein the host cell expresses a UDP- or 
GDP-specific diphosphatase activity. 

22. The method of claim 1, further comprising the step of isolating the 
glycoprotein from the host. 

23. The method of claim 22, further comprising the step of subjecting the 
isolated glycoprotein to at least one further glycosylation reaction in vitro, 
subsequent to its isolation from the host. 

24. The method of claim 1, further comprising the step of introducing into the 
host a nucleic acid molecule encoding one or more enzymes involved in the 
production of GlcNAcMan3GlcNAc2 or GlcNAc2Man3GlcNAc2. 

25. The method of claim 24, wherein at least one of the enzymes has 
mannosidase activity. 

26. The method of claim 25, wherein the enzyme has an α-1,2-mannosidase 
activity and is derived from mouse, human, Lepidoptera, Aspergillus nidulans, C. 
elegans, D. melanogaster, or Bacillus sp. 

27. The method of claim 25, wherein the enzyme has an α-1,3-mannosidase 
activity. 

28. The method of claim 24, wherein at least one enzyme has 
glycosyltransferase activity. 

29. The method of claim 28, wherein the glycosyltransferase activity is selected 
from the group consisting of GnTI and GnTII. 

30. The method of claim 24, wherein at least one enzyme is localized by 
forming a fusion protein between a catalytic domain of the enzyme and a cellular 
targeting signal peptide. 

31. The method of claim 30, wherein the fusion protein is encoded by at least 
one genetic construct formed by the in-frame ligation of a DNA fragment encoding 
a cellular targeting signal peptide with a DNA fragment encoding a glycosylation 
enzyme or catalytically active fragment thereof. 

32. The method of claim 31, wherein the encoded targeting signal peptide is 
derived from a member of the group consisting of mannosyltransferases, 
diphosphotases, proteases, GnT I, GnT II, GnT III, GnT IV, GnT V, GnT VI, 
GalT, FT, and ST. 

33. The method of claim 31, wherein the catalytic domain encodes a 
glycosidase or glycosyltransferase that is derived from a member of the group 
consisting of GnT I, GnT II, GnT III, GnT IV, GnT V, GnT VI, GalT, 
Fucosyltransferase and ST, and wherein the catalytic domain has a pH optimum 
within 1.4 pH units of the average pH optimum of other representative enzymes in 
the organelle in which the enzyme is localized, or has optimal activity at a pH 
between 5.1 and 8.0. 

34. The method of claim 31, wherein the nucleic acid molecule encodes one or 
more enzymes selected from the group consisting of UDP-GlcNAc transferase, 
UDP-galactosyltransferase, GDP-fucosyltransferase, CMP-sialyltransferase, 
UDP-GlcNAc transporter, UDP-galactose transporter, GDP-fucose transporter, 
CMP-sialic acid transporter, and nucleotide diphosphatases. 

35. The method of claim 31, wherein the host expresses GnTI and 
UDP-GlcNAc transporter activities. 

36. The method of claim 31, wherein the host expresses a UDP- or 
GDP-specific diphosphatase activity. 

37. The method of claim 1, further comprising the step of introducing into a 
host that is deficient in dolichyl-P-Man:Man5GlcNAc2-PP-dolichyl alpha-1,3 
mannosyltransferase activity a nucleic acid molecule encoding one or more 
enzymes for production of a GlcNAcMan4GlcNAc2 carbohydrate structure. 

38. The method of claim 1, further comprising the step of introducing into a 
host that is deficient in dolichyl-P-Man:Man6GlcNAc2-PP-dolichyl alpha-1,2 
mannosyltransferase or dolichyl-P-Man:Man7GlcNAc2-PP-dolichyl alpha-1,6 
mannosyltransferase activity a nucleic acid molecule encoding one or more 
enzymes for production of a GlcNAcMan4GlcNAc2 carbohydrate structure. 

39. The method of claim 37 or 38, wherein the nucleic acid molecule encodes 
at least one enzyme selected from the group consisting of an α-1,2 mannosidase, 
UDP GlcNAc transporter and GnTI. 

40. The method of claim 39, further comprising the step of introducing into the 
deficient host cell a nucleic acid molecule encoding an α-1,3 or an α-1,2/α-1,3 
mannosidase activity for the conversion of the GlcNAc1Man4GlcNAc2 structure to 
a GlcNAc1Man3GlcNAc2 structure. 

41. The method of claim 1, further comprising the step of introducing into the 
host a nucleic acid molecule encoding one or more enzymes for production of a 
GlcNAc2Man3GlcNAc2 carbohydrate structure. 

42. The method of claim 41, wherein at least one enzyme is GnTII. 

43. The method of claim 1, further comprising the step of introducing into the 
host cell at least one nucleic acid molecule encoding at least one mammalian 
glycosylation enzyme selected from the group consisting of a glycosyltransferase, 
fucosyltransferase, galactosyltransferase, N-acetylgalactosaminyltransferase, 
N-acetylglucosaminyltransferase and sulfotransferase. 

44. The method of claim 1, comprising the step of transforming host cells with 
a DNA library to produce a genetically mixed cell population expressing at least 
one glycosylation enzyme derived from the library, wherein the library comprises 
at least two different genetic constructs, at least one of which comprises a DNA 
fragment encoding a cellular targeting signal peptide ligated in-frame with a DNA 
fragment encoding a glycosylation enzyme or catalytically active fragment thereof. 

45. A host cell produced by the method of claim 1 or 44. 

46. A human-like glycoprotein produced by the method of claim 1 or 44. 

47. A nucleic acid molecule comprising or consisting of at least forty-five 
consecutive nucleotide residues of Fig. 6 (P. pastoris ALG 3 gene). 

48. A vector comprising a nucleic acid molecule of claim 47. 
49. A host cell comprising a nucleic acid molecule of claim 47. 

50. A P. pastoris cell in which the sequences of Fig. 6 (P. pastoris ALG 3 
gene) are mutated whereby the glycosylation pattern of the cell is altered. 

51. A method to enhance the degree of glucosylation of lipid-linked 
oligosaccharides comprising the step of increasing alpha-1,3 glucosyltransferase 
activity in a host cell. 

52. A method to enhance the degree of glucosylation of lipid-linked 
oligosaccharides comprising decreasing the substrate specificity of oligosaccharyl 
transferase activity in a host cell. 

53. A method for producing in a non-mammalian host cell an immunoglobulin 
polypeptide having an N-glycan comprising a bisecting GlcNAc, the method 
comprising the step of expressing in the host cell a GnTIII activity. 

54. A non-mammalian host cell that produces an immunoglobulin having an N- 
glycan comprising a bisecting GlcNAc. 

55. An immunoglobulin produced by the host cell of claim 54. 

56. A method for producing in a non-human host cell a polypeptide having an 
N-glycan comprising a bisecting GlcNAc, the method comprising the step of 
expressing in the host cell a GnTIII activity. 

57. A non-human host cell that produces a polypeptide having an N-glycan 
comprising a bisecting GlcNAc. 

58. A polypeptide produced by the host cell of claim 57. 

59. A method for producing a human-like glycoprotein in a non-human 
eukaryotic host cell comprising the step of diminishing or depleting from the host 
cell an alg gene activity and introducing into the host cell at least one glycosidase 
activity. 

60. A method for producing a human-like glycoprotein having an N-glycan 
comprising at least two GlcNAcs attached to a trimannose core. 
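
The combinatorial library of claim 44 and the pH criterion of claim 33 can be pictured with a small sketch. Everything named here (the function names, the placeholder labels such as "leader_A", the pH figures in the usage lines) is hypothetical and chosen only for illustration; the code enumerates labels for targeting-peptide / catalytic-domain fusions and applies the two alternative pH conditions as they are worded in claim 33.

```python
from itertools import product


def ph_compatible(ph_optimum, organelle_ph_optima):
    """Apply the two alternative pH conditions recited in claim 33.

    ph_optimum          -- pH optimum of the candidate catalytic domain
    organelle_ph_optima -- pH optima of other representative enzymes
                           in the target organelle (illustrative inputs)
    """
    average = sum(organelle_ph_optima) / len(organelle_ph_optima)
    within_window = abs(ph_optimum - average) <= 1.4
    in_absolute_range = 5.1 <= ph_optimum <= 8.0
    return within_window or in_absolute_range


def build_library(targeting_peptides, catalytic_domains):
    """Enumerate every targeting-peptide / catalytic-domain pairing as a label."""
    return [f"{t}::{c}" for t, c in product(targeting_peptides, catalytic_domains)]


# Placeholder names only; they do not refer to constructs in the specification.
library = build_library(["leader_A", "leader_B"], ["domain_1", "domain_2", "domain_3"])
print(len(library), library[0])
print(ph_compatible(4.0, [6.0, 6.4]))
```

In practice each label would stand for an in-frame DNA construct; the sketch models only the bookkeeping of the pairings, not the cloning itself.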
FIGURE 4 (sheet 1) 

ALG3 Blast 05-22-01 

Sequences producing significant alignments:                         (bits)  Value 

gi|586444|sp|P38179|ALG3_YEAST DOLICHYL-P-MAN:MAN(5)GLCNAC(...        797   0.0 
gi|3024226|sp|Q92685|ALG3_HUMAN DOLICHYL-P-MAN:MAN(5)GLCNAC...       173   7e-43 
gi|3024221|sp|Q24332|NT56_DROVI LETHAL(2)NEIGHBOUR OF TID P...       145   3e-34 
gi|3024222|sp|Q27333|NT56_DROME LETHAL(2)NEIGHBOUR OF TID P...       121   3e-27 
gi|10720153|sp|P82149|NT53_DROME LETHAL(2)NEIGHBOUR OF TID ...       121   5e-27 
gi|1707982|sp|P40989|GLS2_YEAST 1,3-BETA-GLUCAN SYNTHASE CO...        32   2.8 
gi|1346146|sp|P38631|GLS1_YEAST 1,3-BETA-GLUCAN SYNTHASE CO...        31   6.6 

Alignments 

Yeast 

>gi|586444|sp|P38179|ALG3_YEAST DOLICHYL-P-MAN:MAN(5)GLCNAC(2)-PP-DOLICHYL 
MANNOSYLTRANSFERASE 
(DOL-P-MAN DEPENDENT ALPHA(1-3)-MANNOSYLTRANSFERASE) 
(HM-1 KILLER TOXIN RESISTANCE PROTEIN) 
Length = 458 

Score = 797 bits (2059), Expect = 0.0 
Identities = 422/458 (92%), Positives = 422/458 (92%) 

Query: -1 MBGEQSPQGEKSLQRKQFVRPPLDLWQDLKDGVRYVIFDCRANLIVMPLLILFESMLCKJ 60 

MEGEQSPQGEKSLQRKQPVRPPLDLWQDLTOGVRYVIFDCRANLIVMPLLILFE^ 
Sbjct: 1 MEGEOSPQGEKSLQRKQFVRPPLDLWQDLKDGVRyVIFrOU^IV^ 60 

Query: 61 IIKKVAYTEIDYKAYMEQIEMIQLDCMiDYSQVSGGTGPLVYPAGHVLIYKW^ 120 

IIKKVAYTEIDYKAYMEQIEMIQIiDGMIDYSQVSGGTGPLVyPAGHVLIYK^^ 
Sbjct: 61 IIKKVAYTEIDYKAYMEQIEMIQIiDGMLDYSQVSGGTGPLVYPACfflTO 120 

Query: 121 DHVERGQVFFRYLYLLTLAIOMACYYLLHLPPWCVVIJW^ 180 

DHVERGQVFFRYLYLLTLALQMACyYLimPPWC\n/LACLSKRIiHSIYV^^ 
Sbjct: 121 DHVERGO\^FRYLYLLTLALQMACYYLLHLPPWCVVIiACM 180 

Query: 181 FMWTVLGAIVASROIQRPKLKKSIJ^VISATySMAVSIKMNAiaiYFPAmiSLFII^^ 240 

FMVVTVLGATVTASRCHQRPKIjKKSLALVISATYSMAVSIKMNAtt 
Sbjct: 181 FMVVTVLGAIVASRCHQRPPjKKSLALVISATySMAVSIKMNALLYFPA^ 240 

Query: 241 NVILTLLDLVAMIAWQVAVAVPFLRSFPQQYLHCAFNFGRKEMYQWSINWQMMDEEAFTO 300 

NVILTLLDLVAmAWQVAVAVPFLRSFPQQYLHCAFNFGRKFMYQWSINWQmDEEAFiro 
Sbjct: 241 N\n:LTLLDLVAI«aAWQVAVAVPFIJlSFPQQYLHCAFNFGRKFWQWSINWQ^ 300 

Query: 301 KRJXXXXXXXXXXXXXXXFVTRYPRILPDLWSSLCHPIJaCNAVLNAOT 360 

KRF FVTRYPRILPDLWSSLCHPLRKNAVLNANPAKTIPFVLIASN 
Sbjct: 301 KRFHLALLISHLIALTTLFVTRYPRILPDLWSSLCHPLRKNAVLNANPACT 360 

Query: 361 FIGVLFSRSLHYQFLSWYHWTLPILIFWSGMPFFVGPIWYVLHEWCWNSYPPNSQXXXXX 420 

FIGVLFSRSLHYQFLSWYHWTLPILIFWSC3^PFFVGPIWYVLHEWCWNSYPPNSQ 
Sbjct: 361 FIGVLFSRSLHYQFLSWYHWTLPILIFWSGMPFFVGPIWYVLHEWCWNSYPPNSQASTLL 420 

Query: 421 XXXXXXXXXYXXXyTnCSGSVALAKSHLRTTSSMEKKLN 458 

SGSVALAKSHLRTTSSMEKKLN 
Sbjct: 421 IALNTVLLLLLALTQ^lSGSVALAKSHLRTTSS^IEKKLN 458 
FIGURE 4 (sheet 2) 

Human 

>gi|3024226|sp|Q92685|ALG3_HUMAN DOLICHYL-P-MAN:MAN(5)GLCNAC(2)-PP-DOLICHYL 
MANNOSYLTRANSFERASE 
(DOL-P-MAN DEPENDENT ALPHA(1-3)-MANNOSYLTRANSFERASE) 
(NOT56-LIKE PROTEIN) 
Length = 436 

Score = 173 bits (439), Expect = 7e-43 
Identities = 133/396 (33%), Positives = 195/396 (48%), Gaps = 28/396 (7%) 

Ouerv- 26 WQDLKDGVRYVIFDCRANLIVMPnLIIiFESMLCKIII 85 

^' WQ+ R ++ + R Ii+V L I* E + +1 +VAyTEID+KAYM ++E + ++ 

Sbjct: 29 WQER Ri^JiREPRYTLLVAACIiaAEVGITF^ 83 

Ouery- 86 CMJ3YSQVSGGTGPLVyPAGHVLIYKMMm.TEGMDHVERG^ 145 

G DY+Q+ G TGPLVYPAG V 1+ +Y+ T 4- Q P LYL XL L Y 

Sbjct: 84 GTYDYTQIiQGDTGPLVYPAGFNnriFMGLYYATSRGTDIRMAQNIFAVLYIiAT^ 143 

Query: 146 Y-LimPPWC-VVIAaiSKRIiHSIYVLRLFND 203 

+ +PP+ + C S R+HSI+VLRLFND + + +Ii + QR — " 

Sbjct: 144 HCyrCKVPPFVFFFMCCASYRVHSIFVLRLFNDP VAMVLLFLSINLLIiAQRWGWG- 197 

Query- 204 SLALVISATYSl^VSIKMNALLYFPAMMISLFIIiND^^ 263 

+S+AVS+KMN LL+ P ++ L • L L + A + QV + +PF 

Sbjct: 198 -CCFFSIiAVSVKMNVLLFAPGIiLFLLLTQFGFRGALPKIX^ 249 

Query: 264 IJlSFPQQYIiHCAFfcmSRKEMYQWSINWQMmEEAF^ 323 

L P YL +F+ GR+P++ W++NW+ + E F + F + R+ 

Sbjct: 250 liENPSGYIiSRSFDIiGRQFLFHWTVNWRFIiPEALFI^^ 309 

Query- 324 PRILPDLWSSLOIPriRKNAVliNANPAKTIPFVIjIASNFI^ 3 83 

R +SLP++ Ili SNFIG+ FSRSLHYQF WY TLP 

Sbjct: 310 HRTGESILSLLRDPSKRKVPPQPLTPNQrVSTLFTSNFIGICFSRSIiOT 369 

Query- 384 ILIF WSOIPFFVGPIWYVLHEWCWNSYPPNS 414 

L++ W + + + E WN+YP S 

Sbjct: 370 YLLWAl^ARWLTHIiRIjLV]jGLI--EIiSWNTyPSTS 403 

Drosophila virilis 

>gi|3024221|sp|Q24332|NT56_DROVI LETHAL(2)NEIGHBOUR OF TID PROTEIN (NOT58) 
Length = 526 

Score = 145 bits (366), Expect = 3e-34 
Identities = 103/273 (37%), Positives = 157/273 (56%), Gaps = 17/273 (6%) 

Query: 33 VRYVIFIXaU^IVMPLLILFESKLOailKKVAYTEIDYKAY^ 92 

++Y+ F+ A IV L++L E+++ ++I++V yTEID+KAYM++ E L+G +YS 
Sbjct: 34 IKn»APEPAALPIVSVLIVLAEAVINVLVIQRVPYTEIDWKAYMQECEGF-LN^^ 92 

Query: 93 VSGGTGPLVYPAGHVLIYKMFOTILTEGMDHVERGQWFRYLYIiLTLALQMACYY^ 151 

+ G TGPLVYPA V lY +Y+LT +V Q F +YLL + L + Y +P 
Sbjct: 93 LRGDTGPLVYPAAFVYTYSGLYYLTGQGTNVRLAQYIFACIYLLQMCL^^^ 152 

Query: 152 PWCVVLAOi-SKRLHSIYVLRLFNDCFTTLFMVVTVI^AIVASR 210 

P+ +VL+ S R+HSIYVLRLFND L +L A + QR L S 
Sbjct: 153 PYVLVLSAFTSYRIHSIYVLRLFNDPVAIL LLYAALNLFLDQRWTLG S 200 

Query: 211 ATYS^IAVSIKMNALLyFPAMMISLFIIa^>ANVILTLLDLVAMIM 270 

YS+AV +KMN + A + LF L + V+ TL+ L Q+ + PFLR+ P + 

Sbjct: 201 ICYSLAVGVKMN---ILLFAPALLLFYLANLGVLRTLVQLTiaiVLQLFI^ 258 
FIGURE 4 (sheet 3) 

Query: 271 YLHCAFNFGRKFMYQWSINWQMMDEEAFNDKRF 303 

YL +F+ GR F ++W++N++ + +E F + F 
Sbjct: 259 YLRGSFDLGRIFEHKWTVNYRFLSKELFEQREF 291 

Score = 53.3 bits (127), Expect = 1e-06 
Identities = 31/62 (50%), Positives = 41/62 (66%), Gaps = 6/62 (9%) 

Query- 352 IPFVLIASNFIGVLPSRSLHYQFLSWYHWTLPILIFWSGMPFFVGPIWTVLH-^ 409 

+PF L NFIGV +RSLHYQF W£ +IiP L+ WS P+ +G + +L E+CWN+ 
Sbjct: 412 LPFFL--CaTFIGVAaU^IiHyQFYIWyFHSLPYLV-WS-TPYSI*GVRyiiII^IIEY 467 

Query: 410 YP 411 
YP 

Sbjct: 468 TP 469 

Drosophila melanogaster 

>gi|3024222|sp|Q27333|NT56_DROME LETHAL(2)NEIGHBOUR OF TID PROTEIN (NOT56) 
(NOT45) 

Length = 510 
Score = 121 bits (305), Expect = 3e-27 
Identities = 96/272 (35%), Positives = 154/272 (56%), Gaps = 17/272 (6%) 

Query: 34 RYVIFDC3UU3LIVMPLLILFESMLCraiIKK^^^ 93 
+Y++ + A IV ++L E ++I++V YTEID+ AYM++ E L+G +YS + 
Sbjct: 36 KYLLLEPAALPIVGLFVIiLAELVINVWIQRVPYTEIDWVAY>K3ECEGF-L^ 94 

Query: 94 SGGTQPLVYPAGHVLIYKMMYWLTEtaCJHVERGQVFFRYLY^ 152 
G TGPLVYPA V lY +Y++T +V Q F +YLL LAL + Y +PP 
Sbjct: 95 RGDTGPLVYPAAFVYIYSALYYVTSHGTNVRIJMSYIFAGIYLLQIiALVIJ^ 154 

Query: 153 WCVVLACL-SraOjHSIYVXJlLFNDCFTTLFM^^ 211 
+ +VL+ S R+HSIYVLRLPND + V +L A + +R L S 
Sbjct: 155 YVLVLSAFTSYRIHSIYVLRLEHDP VAVLLLYAALNLFLDRRWTL6 ST 202 

Query: 212 TYSMAVSIIO^ALLYFPAMMISLFILNnANVILTLLDLVAMIAW^^ 271 
+S+AV +KMN + A + iiF L + ++ T+L L Q+ + PFL P +Y 
Sbjct: 203 FFSLAVGVKMM--ILLFAPALLLFYLANIiGLLRTIIjQIiAVCXOTQL^^ 260 

Query: 272 LHCAFNFGRKFMYQWSINWQMMDEEAFNDKRF 303 
L +F+ GR F ++W++N++ + + F ++ F 
Sbjct: 261 LRGSFDLGRIFEHKWTVNYRFLSRDVFENRTF 292 

Score = 49.4 bits (117), Expect = 2e-05 

Identities = 27/60 (45%), Positives = 35/60 (58%), Gaps = 2/60 (3%) 

Query: 352 IPFVLIASNFIGVLFSRSLHYQFLSWYHWTLPILIFWSGMPFFVGPIVmfLHEW 411 

+PF L N +GV SRSLHYQF WY +LP L + + V + L E+CWN+YP 

Sbjct: 407 LPFFL--CaiLVGVACSRSLHyQFYVWYFHSLPYLAWSTPYSIiGVRCLIIiGLIEYCWin:T 464 
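
The percentages printed in the alignment headers of Figure 4 appear to be truncated to whole numbers rather than rounded (for example, 6/62 is about 9.7% but is printed as 9%). The sketch below, with a hypothetical helper name, reproduces several of the printed values under that assumption; it is offered as a reading aid, not as a description of the alignment program's internals.

```python
def blast_percent(count, length):
    """Whole-number percentage obtained by truncation (integer division)."""
    return (100 * count) // length


# (count, alignment length, percentage as printed in Figure 4)
PRINTED = [
    (422, 458, 92),  # S. cerevisiae identities
    (133, 396, 33),  # human identities
    (28, 396, 7),    # human gaps
    (96, 272, 35),   # D. melanogaster identities
    (154, 272, 56),  # D. melanogaster positives
    (6, 62, 9),      # gaps in the short second D. virilis segment
]

for count, length, printed in PRINTED:
    print(count, length, blast_percent(count, length), printed)
```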
FIGURE 4 (sheet 4) 

Matrix: BLOSUM62 
Gap Penalties: Existence: 11, Extension: 1 
Number of Hits to DB: 28883317 
Number of Sequences: 96469 
Number of extensions: 1107545 
Number of successful extensions: 2870 
Number of sequences better than 10.0: 16 
Number of HSP's better than 10.0 without gapping: 5 
Number of HSP's successfully gapped in prelim test: 11 
Number of HSP's that attempted gapping in prelim test: 2839 
Number of HSP's gained (non-prelim): 23 
length of query: 458 
length of database: 35,174,128 
effective HSP length: 45 
effective length of query: 413 
effective length of database: 30,833,023 
effective search space: 12734038499 
effective search space used: 12734038499 
T: 11 
A: 40 
X1: 15 ( 7.1 bits) 
X2: 38 (14.6 bits) 
X3: 64 (24.7 bits) 
S1: 40 (21.8 bits) 
S2: 67 (30.4 bits) 
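
The search summary of Figure 4 (sheet 4) is internally consistent, which can be checked with a few lines of arithmetic. The sketch below recomputes the effective lengths and the effective search space from the printed values, and then estimates expectation values from bit scores using the standard Karlin-Altschul relation E = m·n·2^(−S'). The function names are illustrative assumptions; because the printed bit scores are rounded, the recomputed E-values are only expected to agree approximately with the figure.

```python
# Values printed in the search summary of Figure 4 (sheet 4).
QUERY_LENGTH = 458
DATABASE_LENGTH = 35_174_128
NUM_SEQUENCES = 96_469
EFFECTIVE_HSP_LENGTH = 45


def effective_search_space(query_len, db_len, num_seqs, hsp_len):
    """Effective query length times effective database length."""
    effective_query = query_len - hsp_len
    effective_db = db_len - num_seqs * hsp_len
    return effective_query * effective_db


def expect_value(bit_score, search_space):
    """Expected number of chance hits for a normalized (bit) score."""
    return search_space * 2.0 ** (-bit_score)


space = effective_search_space(
    QUERY_LENGTH, DATABASE_LENGTH, NUM_SEQUENCES, EFFECTIVE_HSP_LENGTH
)
print(space)  # 12734038499

# Bit scores taken from the alignments in Figure 4.
for bits in (145, 53.3, 49.4):
    print(bits, f"{expect_value(bits, space):.0e}")
```

Each additional bit of score halves the expectation value, which is why the hits near 30 bits in the list are statistically insignificant while the hits above 100 bits are not.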
FIGURE 5 



ATGGAAGGTGAACAGTCTCCGCAAGGTGAAAAGTCTCTGCAAAGGAAGC 

AATITGTCAGACCTCCGCTGGATCTGTGGCAGGATCrCAAGGACGGTGTG 

CGCTACGTGATCTTCGATTGTAGGGCCAATCTTATCGTTATGCCCCrnTG 

ATTTrGTTCGAAAGCATGCTGTGCAAGATTATCATTAAGAAGGTAGCTTAC 

ACAGAGATCGATTACAAGGCGTACATGGAGCAGATCGAGATGATTCAGCT 

CGATGGCATGCTGGACTACTCTCAGGTGAGTGGTGGAACGGGCCCGCTGG 

TGTATCCAGCAGGCCACGTCTTGATCTACAAGATGATGTACTGGCTAACA 

GAGGGAATGGACCACGTTGAGCGCGGGCAAGTGTmrCAGATACTTGTA 

TCTCCrrACACTGGCGTTACAAATGGCGTGTTACTACCTTTTACATCTACC 

ACCGTGGTGTGTGGTCTTGGCGTGCCTCTCTAA^GATTGCACTCTATTTA 

CGTGCTACGGTTATTCAATGATTGCTTCACTACTTTGTTTATGGTCGTCACG 

GTTTTGGGGGCTATCGTGGCCAGCAGGTGCCATCAGCGCCCCAAATTAAA 

GAAGTCCCTTGCGCTGGTGATCTCCGCAACATACAGTATGGCTGTGAGCA 

TrAAGATGAATGCGCTGTTGTATTTCCCTGCAATGATGATTTCTCTATTCAT 

CCrrAATGACGCGAACGTAATCCTTACTTTGTTGGATCTCGTTGCGATGAT 

TGCATGGCAAGTCGCAGTTGCAGTGCCCTTCCTGCGCAGCnTCCGCAACA 

GTACCTGCATTGCGCriTrAATTTCGGCAGGA^GTTTATGTACeAATGGAG 

TATCAATTGGCAAATGATGGATGAAGAGGCTTTCA^TGATAAGAGGTTCC 

AOTGGCCCTTTTAATCAGCCACCTGATAGCGCTCACCACACTGTTCGTCA 

CAAGATACCCTCGCATCCTGCCCGATTTATGGTCTTCCCTGTGCCATCXJGC 

TGAGGAAAAATGCAGTGCTCAATGCCAATCCCGCCAAGACTATTCCATTC 

GTrCTAATCGCATCCAACTTCATCGGCGTCCTATTTTCAAGGTCCCTCCAC 

TACCAGTTTCTATCCTGGTATCACTGGACTITGCCTATACTGATCTm^^ 

CGGGAATGCCCTTCTTCGTTGGTCCCATTTGGTACGTCTTGCACGAGTGGT 

GCTGGAATTCCTATCCACCAAACTCACAAGCAAGCACGCTATTGTTGGCA 

TrGAATACTGTTCTGTTGCTTCTATTGGCCTTGACGCAGCTATCTGGTTCGG 

TCGCCCTCGCCAAAAGCCATCTTCGTACCACCAGCTCTATGGAAAAAAAG 

CTCAACTGA 



S. cerevisiae Alg3p 

MEGEQSPQGEKSLQRKQFVRPPII)LWQDLKDGVRYVIFDCRA1«}LIVMPLLI^ 

EESMIX3aiIKKVAYTEIDYKAYMEQIEmQII)GMIJDYSQVSGGTGPLVYPAG 

HVLIYKMMYWLTEGMDHVERGQWFRYLYLLTLALQMACYYLLHLPPWC^ 

VIACI^KRIJIS]yVLRLF]^CFmJTV[VW 

ISATYSMAVSIKMNAIXYFPAMMISLFILNDANVILTIJX>LVAMIA 

WFOlSFPQQYmCAFNFGRKFMYQWSINWQMMDEEAFl^KREHLALLISHL 

lALTTLFVTRYPRILPDLWSSLCHPIJlKNAVLNANPAKTIPFVLI^^ 

RSUIYQFI^WYHWTLPnJFWSGMPFFVGPIWYVIJffiWCWSYPPNSQASTl, 

LIAIl'mrQXLIALTQIJSGSVAIJVKSHIJlTTSS 
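
The relationship between the nucleotide sequence and the Alg3p protein sequence shown in this figure can be checked by conceptual translation with the standard genetic code. The sketch below is illustrative only: the function and constant names are assumptions, and it translates just the first printed line of the S. cerevisiae ALG3 coding sequence, which is legible as printed. Codons containing characters other than A, C, G or T are reported as "X" so that damaged stretches are visible rather than silently guessed.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1 up to the first stop codon or the end."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks unreadable codons
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First printed line of the S. cerevisiae ALG3 coding sequence.
FIRST_LINE = "ATGGAAGGTGAACAGTCTCCGCAAGGTGAAAAGTCTCTGCAAAGGAAGC"
print(translate(FIRST_LINE))  # MEGEQSPQGEKSLQRK
```

The output matches the first sixteen residues of the Alg3p sequence printed above, which is a quick way to confirm that a transcribed DNA line is in the expected frame.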
FIGURE 6 

P. pastoris ALG3 

ATGCCTCCGATAGAGCCAGCTGAAAGGCCAAAGCTTACGCTGAAAAATGT 

TATCGGTGATCTAGTGGCTCTTATTCAAAACGTTITATTTAACCCAGATTTT 

AGTGTCrrCGTTGCACCTUITITATGGTTAGCTGATTCCATTGTTATCAAGG 

TGATCATTGGCACTGTTTCCTACACAGATATTGATmTCTTCATATATGCA 

ACAAATCnTAAAATTCGACAAGGAGAATTAGATTATAGCAACATATTTG 

GTGACACCGGTCCATTGGTTTACCCAGCCGGCCATGTTCATGCTTACTCAG 

TACTTTCGTGGTACAGTGATGGTGGAGAAGACGTCAGTTTCGTTCAACAA 

GCATTTGGTTGGTTATACCTAGGTTGCTTGTTACTATCCATCAGCTCCTACT 

ITITCTCTGGCTTAGGGAAAATACCTCCGGTITATTTTGTTTTGTTGGTAG^ 

GTCCAAGAGACTGCATTCAATATTTGTATTGAGACrCTTCAATGACTGlTT 

AACAACATTTTTGATGTTGGCAACTATAATCATCCTTCAACAAGCAAGTAG 

CTGGAGGAAAGATGGCACAACTATTCCATTATCTGTCCCTGATGCTQOAG 

ATACGTACAGTTTAGCCATCTCTGTAAAGATGAATGCGCTGCTATACCTCC 

CAGCATTCCTACTACTCATATATCTCATTTGTGACGAAAATTTGATTAAAG 

CCTTGGCACCTGTTCTAGTTTTGATATTGGTGCAAGTAGGAGTCGGTTATT 

CGTrCATTTTACCGTTGCACTATGATGATCAGGCAAATGAAATTCGTTCTG 

CCrACnTTAGACAGGCTTTTGACTTTAGTCGCCAATTTCTTTATAAGTGGA 

CGGTTAATTGGCGCTTTTTGAGCCAAGAAACTTTCAACAATGTCCATTrrC 

ACCAGCTCCTGTTTGCTCTCCATATTATTACGTTAGTCTTGTTCATCCTCAA 

GrrCCTCTCTCCTAAAAACATTGGAAAACCGCTTGGTAGATTTGTGTTGGA 

CATTTTCAAATTTTGGAAGCCAACCTTATCTCCAACCAATATTATCAACGA 

CCCAGAAAGAAGCCCAGATTTTGTTTACACCGTCATGGCTACTACCAACTT 

AATAGGGGTGCTITITGCAAGATCTTTACACTACCAGTTCCTAAGCTGGTA 

TGCGTTCTCTTTGCCATATCTCCTTTACAAGGCTCGTCTGAACTTTATAGCA 

TCTATTATTGTTTATGCCGCTCACGAGTATTGCTGGTTGGTTTTCCCAGCTA 

CAGAACAAAGTrCCGCGTTGTTGGTATCTATCTTACTACTTATCCTGATTC 

TCATTTTTACCAACGAACAGTTATTTCCTTCTCAATCGGTCCCTGCAGAAA 

AAAAGAATACATAA 



P. pastoris Alg3p 

MPPffiPAERPKLTIiasrS/IGDLVALIQlSrS^IJWDFSVFVAPI^ 
WSYTOroFSSYMQQIEORQGELDYSNlFGDTGPLVYPAGHVHAYSVLSWYS 
DGGEDVSFVQQAFGWLYLGCLLLSISSYFFSGLGEaPPVYFVLLVASKRLHSIF 
VLMJT^CLTTFIMIAinil^QASSWRKDGTTIPI^W 

ALLYlJAFLIXimCT>ENIJKALAPVLVIJLVQVGVGYSnLPIJIYDIX2A>ffiIR 

SAYFRQATOFSRQFLYKWTVNWKFLSQETFNNVHFHQLUfAIJ^^ 

I^PKNIGKPLGKFVLDIFKFWKPlI^PTIOTDSroPERSPDF\nriA^^ 

AI^IiIYQFI^WAFSIJ>YLLYKAia2gFIASIIVYAAHEYCWLVFPATEQSSAL 

LVSnXLBLn-IFTNEQLFPSQSVPAEKKNT 
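
Claim 47 refers to nucleic acid molecules sharing at least forty-five consecutive nucleotide residues with the sequence of this figure. One simple way to test a candidate sequence against such a threshold is an exact k-mer lookup, sketched below. The function name and the flanking bases in the usage lines are hypothetical; the reference used for the demonstration is only the first printed line of the P. pastoris ALG3 sequence, and the check considers the given strand only (a reverse-complement comparison would be a separate call).

```python
def shares_run(candidate, reference, run_length=45):
    """True if `candidate` contains `run_length` consecutive bases of `reference`."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < run_length or len(reference) < run_length:
        return False
    # Index every window of the reference, then slide over the candidate.
    windows = {
        reference[i:i + run_length]
        for i in range(len(reference) - run_length + 1)
    }
    return any(
        candidate[i:i + run_length] in windows
        for i in range(len(candidate) - run_length + 1)
    )


# First printed line of the P. pastoris ALG3 sequence (50 bases).
REFERENCE = "ATGCCTCCGATAGAGCCAGCTGAAAGGCCAAAGCTTACGCTGAAAAATGT"

# Hypothetical candidates: 45 or 44 reference bases between arbitrary flanks.
print(shares_run("GGG" + REFERENCE[2:47] + "CCC", REFERENCE))
print(shares_run("GGG" + REFERENCE[2:46] + "CCC", REFERENCE))
```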
FIGURE 7 (sheet 1) 

P. pastoris ALG3 BLAST 

Sequences producing significant alignments:                         (bits)  Value 

gi|586444|sp|P38179|ALG3_YEAST Dolichyl-P-Man:Man(5)GlcNAc(...        228   2e-58 
gi|12802365|gb|AAK07848.1|AF309689_10 putative NOT-56 manno...       212   8e-54 
gi|984725|gb|AAA75352.1| ORF 1                                       206   4e-52 
gi|7492702|pir||T39084 probable mannosyl transferase - fissi...      176   8e-43 
gi|16226531|gb|AAL16193.1|AF428424_1 At2g47760/F17A22.15 [A...       164   2e-39 
gi|25367230|pir||B84919 Not56-like protein [imported] - Ara...       164   3e-39 
gi|25814791|emb|CAB70171.2| Hypothetical protein K09E4.2             161   2e-36 
gi|17535001|ref|NP_496950.1| Putative plasma membrane membr...       160   3e-36 
gi|1654000|emb|CAA70220.1| Not56-like protein [Homo sapiens...       155   2e-36 
gi|13279206|gb|AAH04313.1|AAH04313 Unknown (protein for IMA...       154   2e-36 
gi|22122365|ref|NP_666051.1| hypothetical protein MGC36684 ...       150   3e-35 
gi|21292031|gb|EAA04176.1| agCP3388 [Anopheles gambiae str...        120   4e-26 
gi|1780792|emb|CAA71167.1| lethal(2)neighbour of tid [Droso...       114   3e-24 

Alignments 

S. cerevisiae 
Score = 228 bits (580), Expect = 2e-58 

Identities = 154/429 (35%), Positives = 229/429 (53%), Gaps = 37/429 (8%) 

Query: 9 RPKLTIiKWIGDLVALIQNVLFNPDFSVFVAPIJJWIiADSIVIKVIIGTVSYTDIDFSS™ 68 

RP L L DL ++ V+F+ ++ V PLL L +S++ K+II V+yT+ID+ +yM 

Sbjct: 20 RPPIDLWQ DLKlXSVRyVIFDCJlAmiVMPIiIiILFESML^ 76 

Query: 69 QQIFKIR-QGELDySNIF(3DTGPLVYPAGHVHAySVLSWYSIX3GEDVSF7QQAF 127 

+QI 1+ G LDYS + G TGPIiVYPAGHV Y ++ W ++G + V Q F +LYIi 
Sbjct: 77 EQIEMIQIJX3«XiDYSQVSGGTGPLVYPAGHVLIYKMMYWLTEGMDHVK^ 136 

Query: 128 CLIiliSISSYFFSGIiGKIPPVYFVIiLVASKllLHSIFVLRIiFS^ IIIiQ 184 

L Ij ++ y+ Jj +PP VL SKRIjHSI+VIiRIiFMDC TT M+ T+ I+ 
Sbjct: 137 TIiAL<»1ACYY---IJmPPWCVVIiACriSKRIiHSIY^ 193 

Query: 185 (^ASSWRiOXSTTIPLSVPDAAITrYSIiAlSVKMNXXXXXXXXXXX^^ 244 

+ K ++ L + + TYS+A+S+KMN D N+I L 

Sbjct: 194 RCHQRPKLKKSLALVI SATYShlAVSIKMNALLYPPAMraSIiFILindANVILTLLDLV 250 

Query: 245 XXXXZXXXXXYSFILPLHYDDQAMEIRSAYFRQAFDFSRQFLYKmVNWRFIiSQETFiaNV 304 

F+ Y AF-l-F R+F-I-Y-I-W+-I-NW-I- -f +E 

Sbjct: 251 AMIAWQVAVAVPFL RSFPQQYLHCAFNFGRKFMYQWSINWOMMDEEAraDK 301 

Query: 305 HPHQLLFAIiHIITL-VLFIIiKFLSPKMIGKPLGRFVIJSIFKPinCPTM 362 

FH L H+I L LF+ R + D++ L ++N +P ++ 

Sbjct: 302 RFmiALLISHLIAIiTTLFVTRY PRILPDLWSSLCHPLRKNAVLNANPAICT 351 

Query: 363 PDFVYTVMATTNLIGVLFARSJjHYQFLSHYAFSLPYIJjYKARIOTIASII^^^ 422 

F V+ +li IGVIiF+RSLHYQPIiSWy ++IiP L++ + + P I Y HE+CW 
Sbjct: 352 IPF VMASNPIGVLPSRSiaYQFLSWYHWTLPILIPWSGMPFFVGPIWYVLHEWCWN 408 

Query: 423 VPPATEQSS 431 

+P Q+S 
Sbjct: 409 SYPPNSQAS 417 
FIGURE 7 (sheet 2) 

Neurospora crassa 

Score = 212 bits (540), Expect = 8e-54 

Identities = 140/400 (35%), Positives = 212/400 (53%), Gaps = 29/400 (7%) 

Query: 35 SVFVAPIiWLADSIVIKVIIGTOSYTDIDPSSyMQQIFICTRQGELDYSNIFro 94 

S + P L+L D+++ +11 V YT+ID+++YM+Q+ +1 GE VY+ + G TGPLVYP 
Sbjct: 33 SKtilPPALFLVDAIJXXSIiIIWKVPyTEIDWAAYMEQ^^^ 92 

Query: 95 AGHVHAySVIiSWYSIX3GEDVSFVQQAFGWLYI.GCIiIjLSISSy^^ 154 

A HV+ y+ L +D G ++ QQ F LY+ L + + y+ K PP F LL 

Sbjct: 93 AAHVyIyTCLYHLTDE6KNIIlI4AQQIlFAGLY^^r^IA^ 149 

Query: 155 SKRLHSIFVLRLENDOiTTFIMATIIILQQASS 214 

SKRIHSIPVLR FmC + I Q+ +W+ A Y+L + VK 

Sbjct: 150 SKRISSIFVLRCENDCFAVLFLWIAIFFFQR-RNWQA GALLYTLGLGVK 197 

Query: 215 MJKXXXXXXXXXXSXXXOJENLIKMiAPXXXXXXXXXXXXYSFII^ 274 

M ++L F+HY+Y 
Sbjct: 198 MTLLLSIiPAVGIVLFIjGSG-SFVTTIiQLYATMGLVQILIGVPFL--AHyPTE Y 24 7 

Query: 275 raQAFDFSRQFLYKWTVNWRFLSQETFNNVHFHQLLFAimiTLVLFI-I^^ 333 

+AF+ SRQP +KWTVNWRF+ +E F + F L AIiH++ L +FI +++ P K 
Sbjct: 248 LSRAFELSRQFFFKWTVNWRFVGEEIFLSKGFALTLIiALHVL^^ 3 05 

Query: 334 PIiGRFVLDIFKFWKPTLS-PTNIINDPERSPDFVYTVl^T^ 392 

Ii + + + KPIH-P+ ++P++T + +N +G+LFARSIiHYQF 

Sbjct: 306 SLVQLISPVLLAGKPPLTVPEHRAAARDVTPRYimTILSAmVGLLFARSLHY 365 

Query: 393 AFSLPYLLYKARLNFIASIIVYAAHEYCWLVFPATEQSSA 432 

A+S P+I1L++A L+ + +++A HE+ W VFP+T SSA 
Sbjct: 3 66 AWSTPFIjLWRAGLHPVIjVYLLWAVHEWAWNVFPSTPASSA 405 

Schizosaccharomyces pombe 
Score = 176 bits (445), Expect = 8e-43 

Identities = 132/390 (33%), Positives = 194/390 (49%), Gaps = 35/390 (8%) 

Query: 42 LWLADSIVIKVIIGTVSYroiDFSSYMQQIFiaRQGEIiDYSNIFGDTGPLV^ 101 

L L + + II V YT+1D+ +YM+Q+ GE DY ++ G TGPLVYP GHV Y 

Sbjct: 30 LLLIiEIPPVFAIISKVPYTEIDWIAXMEQVNSFIJiGERDy^ 89 

Query: 102 SVLSWYSDGGEDVSFVQQAFGWLYLGCMiLSISSYPFSGLGKIPPVY^ isi 

++L + +DG6 ++ Q F ++Y + +1 Y F +' + P +VLL+ SKRLHSI 
Sbjct: 90 TLLYYLXTCGTraVRAQYIPAFVYW--ITTAIVGYLFK-rVRAPFYIYVIiLI^^ 146 

Query: 162 FVIiRLPNDClTTFIJaATIIILQOASSWRKDGTTIPLSVPDA^ 221 

P+LRLFND + Ii + 1+ W + A+ S+A SVKM+ 

Sbjct: 147 FIIiRIJOTXSFMS-LPSSIJIMSCKKKWVR — ASILLSVACSVKMSSLLYV 194 

Query: 222 3QCXXXXXXXXCDEOT,IKAIAPXXX2CXXXXXXXX^ 281 

^ P + + + +y+ QAPDF 

SbDCt: 195 PAYLVL LLQILGPIOCTWMHIFVIIXVQILFSIPF LAYFWSYWTQAFDF 242 

Query: 282 SRQFLYKPm/NWRFLSQETFNNVHFHQLLFAmilTLVLFILKFLSP^ 341 

R P YKWTVNWRF+ + F + F + LH+ LV F K + + p 
Sbjct: 243 CSRAFDYKWTVNWRFIPRSIFESTSFSTSILFiaVALLVAFTCKHW 295 

Query: 342 IFKFWKPTLSPTNIINDPERSPDPVYTVMATTOIilGVI^ 4OI 
P L+ + +P+F++T +AT+NI.IG+L ARSLHYQP +W+A+ PYL Y 
Sbjct: 296 -FAMVNSl^TLKPLPKLQIATPNFIFTAIATSNLIGIIiCA^ 354 

Query: 402 KARLNFIASUVYAAHEYCWLVFPATEQSS 431 

+A I ++ EY W VFP+T+ SS 

Sbjct: 355 QASFPAPIVIGLWMWEYAWNVFPSTKLSS 384 
Arabidopsis thaliana 

Score = 164 bits (415), Expect = 2e-39 

Identities = 131/391 (33%), Positives = 194/391 (49%), Gaps = 29/391 (7%) 

Ouerv- 42 LMIJUJSIVIKVIIGTVSYTDIDFSSYMQQIFKIRQGEIJ^YSNIFGDTGPLVy^^ 101 

L IAD+I++ +11 V YT ID+ +YM Q+ GE DY N+ GDTGPLVYPAG ++ Y 

Sbjct: 39 LIIMAILVJ^IIAYVPYTKIDWDAYMSQVSGFIK3GERDYGKMGDTGPLVYPAGFLY^ 98 

Ouerv- 102 SVLSWYSDGGEDVSFVQQAFGWLYLGCLLLSISSYFFSGLGn 1^1 

S + + G +V Q FG LY+ L + + Y + + +P LL SKR+HSI 

Sbjct: 99 SAVQOTiTGG--BVYPAQILFGVLYIVNLGrVIiIXYVKn>7--^ 154 

Query- 162 FVlJU^EOTXZLTTFIWIATIIIIi^ ^^1 

FVIiRIiFNDC Irf A++ + +RK + + +S A+SVKMN 

Sbjct: 155 FVLRIJOTXIFAMmiiHASMALFli YRKHHIi®«LV FSGAVSVKMNVLLYA 202 

Ouerv- 222 XXXTKXXXXXCDEmiKAIiAPXXXXXXXXXXXX^ 281 

^" N+I ++ F++ +Y AFD 

Sbjct: 203 PTLIiUjIiiaCAM-N^ SYIANAFDIi 251 

Query: 2B2 SRQFLYKWTVNWIUT^SQETFNNVHFHQLLFA^ 3*1 

R F++ W+VN++F+ + F + F L H+ LV F + K+ G +G 
Sbjct: 252 GRVFlHPWSVNFKFVPERVFVSKEFAVCIililAHluFLJi 310 

Ouerv- 342 iFKFWKP-TLSPTNIINDPERSPDFVYTVMATTNIiIGVLFARSIiHYQ 400 
^ ' p p +LS +++ + + V T M N IG++FARSIiHYQF SWY +SIiPYIiI* 

Sbjct: 311 HFFLTIiPSSLSFSDVSASRlITKEHVVTAMFVGHFIGIVFARS^ 370 

Query: 401 YKARUTFIASIIVYAAHEYCWLVFPATEQSS 431 

++ +]^++ E CW V+P+T SS 

Sbjct: 371 TOTPFPTWLRIjIMFIjGIEIiCWNVYPSTPSSS 401 
FIGURE 8 

K. lactis ALG3 

TTTGTTTACAAGCTGATACCAACGAACATGAATACACCGGCAGGTTTACT 

GAAGATTGGCAAAGCTAACCnriTACATCCTTTTACCGATGCTGTATTCAG 

TGCGATGAGAGTAAACGCAGAACAAArrGCATACATTTTACTTGTTACCA 

ATTACATTGGAGTACTATTTGCTCGATCATTACACTACCAATTCCTATCTT 

GGTACCATTGGACGTTACCAGTACTATTGAATTGGGCCAATGTTCCGTATC 

CGCTATGTGTGCTATGGTACCTAACACATGAGTGGTGCTGGAAGAGCTAT 

CCGCCAAACGCTACTGCATCCACACTGCTACACGCGTGTAACACATACTG 

TTATTGGCTGTATTCTTAAGAGGACCCGCAAACTCGAAAAGTGGTGATAA 

CGAAACAACACACGAGAAAGCTGAG 

K. lactis Alg3p 

FVYKLIPTNMNTPAGIXKEGKANmfi'FroAWSAMRVNAEQI^ 

GVLFARSI^QFI5WYHWTIPVLLNWANVPYPK:VLWYLTEJEWCWNSTO 

NATASlI.imasriY(JrSVLYSZEDPQTOKVVl^ 
FIGURE 9

K. lactis ALG3 BLAST



Score E

Sequences producing significant alignments: (bits) Value 



SI 



Hi 



3i 



Hi 



3i 



586444|sp|P38179|ALG3_YEAST Dolichyl-P-Man:Man(5)GlcNAc(... 125 1e-28
984725 I qb I AAA75352 > 1 1 ORF 1 M 4e-19 



16226531 



25367230 



21292031 



20892051 



gb|AAL16193.1|AF428424_1 At2g47760/F17A22.15 [A... 72 1e-12



I /^Auxo-L^yj • 1 *• "——3 - ' • — ' - '■ 

pir||B84919 Not56-like protein [imported] - Ara... 72 1e-12



gb|EAA04176.1| agCP3388 [Anopheles gambiae str... 69 2e-11



ref|XP_148657.1| similar to Lethal(2) neighbour... 65 2e-10



Alignments 



S. cerevisiae 



Score = 125 bits (314), Expect = 1e-28

Identities = 60/120 (50%), Positives = 83/120 (69%), Gaps = 1/120 (0%)
Frame = +3 

Query: 66 ANIiIiHPFT-DAVFSAMRVNABQIAYIIiVTOyiG\nJFARSI^ 242 

++L HP +AV +A A+ I ++Irl- +N+IGVIiF+RSIjHYQFLSWyHWTIiP+Ii+ W+ 
Sbjct: 332 SSI<CHPLRKNAVUaNP--AKTIPFVLIASNPIGVLF 389 

Query: 243 NVPYPIiCOTiWYLTHEWaWSYPPmTASTLLHAi^^ 422 

+P+ + +WY+ HEWCWNSyPPN+ ASTLL ANT L+ +V + KHR 

Sbjct: 390 GMPFFTOPIWysnLflEWCTmSyPPNSQASTIJ^IiAIJ^^ 448 



A. thaliana 
Score = 72.0 bits (175), Expect = 1e-12

Identities = 42/107 (39%), Positives = 57/107 (53%), Gaps = 3/107 (2%)
Frame = +3 

Query: 84 FTDAVFSAMRVNAEQIAYIIJjVTNYIGVIiFARSLHYQFLSWYHWTIiFV^^ 263 

F+D ' S + + E + + V N+IG++FARSIjHYQF SWY ++IjP IiL PL 
Sbjct: 322 FSDVSASRI-ITKEHVVTAMFVGNFIGIVFARSIiHYQFYSWYFYSLPYIiLWRTPFPTWLR 380 

Query: 264 VLWYLTHEWCWNSYPPNATASTL LHACNTYCYWbYS*EDPQTRK 395 

++ +L E CWN YP ++S L LH WL DP K 

Sbjct: 381 LIMFLGIELCTNVYPSTPSSSGLLLCLHLIILVGLWIAPSTO 427 
FIGURE 10

AxSS^C^iSsCXKSTAACXJATTAGTITATrACTGTTGTrA 

ATATATTCAGCCGACATrCTCGTrAATTrCAGATKK^GATGAAACTTTTAA^ 

GGGAACX:ATtAAATTTATrGGTACGTGGATTrGGTAAACAAACCTGGGA^^ 

AOX^GAGTATICTATTAGATCATGGGCTITCTrATrACXnTTI^^ 

TCCAGTAAACAAVTITACTGACCTAGAAAGTCATTGGAACllliiCATCACy^G^ 

GCATGCITAGGCmriTAGTITrATCATGGAATITAAACTACATCGTGAAAT^ 

AGGCAGCTTGGCAlTGCAAATCGCAAATATITGGATrATrTTCCAATTG'nTAATC 

CGGGCTGGTTCCATGCATCTGTGGAATTATTGCOTCTGCCGTrGCCATGTTGTTG 

TATGTAGGTGCCArcAGACACTCTUTACGCTATCnnXXACTGGGTCTAC^ 

CrrrACGAAAAGTTTAGCGTACAATITCCTGGCr AGTAT ACrAGGCrGGCCATTTG 

TXTTAATITrAAGCTKKICATTATGTITACATrACCTITrCAACCATAGAATrAm 

CTACCATCAGAACXGCATKXiACTGCrGTITGATATITrCATrGACTGCAm 

GTGATTGTCACTGACAGTATATITrACGGGAAGCTrGCTCCTGTATCATGGA^ 

TCnTATirTACAATGTCATTAATGCAAGTGAGGAATCTGGCCCAAATATTTTCGGG 

GTTOAGCCATCGTACTACTATCCACTAAATITGTrACTGAATTrCCCACTGCCTGT 

GCTAGTmAGCTATTITGGGAATrTrCCATITGAGATTATGGCCATTATGGGCAT 

CATTATTCACATGGATTGCCGTITrCACTCAACAACCTCACAAAGAGGAAAGATT 

TCTXJTATCCAATITACGGGTTAATAACnTrGAGTGCAAGTATaSCCnTrrACAAAG 

TGTTGAATCTATrCAATAGAAAGCCXjATTCrrAAAAAAGGTATAAAGTrGTCAGT 

TITATTAATTGTTGCAGGCCAGGCAATGTCACGGATAGTGGCnTGGTGAACAAT 

TACACAGCTCXTATAGCCGTCTACGAGCAATTITCrTCACTAAATCAAGGTGGTG 

TGAAGGCACCGGTAGTOAATGTATGTACGGGACGTGAATGGTATCACrrCCCAAG 

TrCTrrCCTXKrnKXAGATAATCATAGGCTAAAAT TTGIT AAATCTGGATTTGATG 

GTCTTCTTCCAGGTGATmCCAGAGAGTGGTTCTATTITCAAAA^GATTAGAACT 

TTACCTAAGGGAATGAATAACAAGAATATATATGATACCGGTAAAGAGTGGCCG 

ATCACTAGATGTGATTATriTATrGACATCGTCGCCCCAATAAATTTAACAAAAG 

ACGTTITCAACCCrCTACATCTGATGGATAACTGGAATAAGCTGGCATGTGCTGC 

ATrCATCGACGGTGAAAATTCTAAGATITrGGGTAGAGCATnTACGTACCGGAG 

CCAATCAACCGAATCATGCAAATAGTnTACCAAAACAATGGAATCAAGTGTACG 

GTGirCGTTACATTGATTACTGTITGTITGAAAAACCAACTGAGACTACTAATrGA 

S. cerevisiae Alg9p 

MNCKAVTISIXLIJLn.TOVYIQPTFSIJSDaDErFNYWEPLNLLVRGFGKQ 

YSmSWAFLLPFYCILYPVNKFroiJESHWlOTITRACLGFFSF^^ 

lANIWlIFQIJFNPGWEHASVEIJLPSAVAMIXYVGATiaiSLRYLSTGST^ 

n.ASILGWPFVLE^IJ?LCIJBYLFNHIUISmTAn)CCLIFSLTAFAVIVTO 

VSWNlIPYNSTNASEESGPNffGVEPWYYYPLNIXLl^LPVL\^ 

ASIJTWIAVITX2QPHKEERIT.WrYGLIIXSASIAF^ 

VAGQAMSRWALVNNiTAPIAVYEQFSSIJvfQGGVKAPVVNVCTGREW^ 

DJraiO.KI^SGroGLLPGDITESGSIFKiaRllJ'KGMNNKNiyDTGKEWP 

DIVAPINLTBaDVFNPIJn.MDN WNKI ACAAITOGENSmGRAFyWE 

KQWNQVYGVRYIDYCLFEKPTETTN 
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FIGURE 11 

P. pastoris ALG9 

TGGCCTTCCTGTCTGCTCGATACTTCCTTTTACAGTAACCAACATACATGTT 

CTCCAACATGCTCTTGTATGTATTGGCCTATTCTATCTTGAGACTTGATATC 

AACCTTUrATGGTATTATTTCAGACTGTGATGAAGTGTTCAACTACTGGGA 

GCCACTCAACTTCATGCTTAGAGGGTTTGGAAAACAGACTTGGGAGTATT 

CTCCAGAGTATGCCATCCGATCTTGGTCCTATCTAGTGCCACnTGGATAG 

CAGGCTATCCACCATTGTTCCTGGATATCCCTTCTTACTACTTTTTCTACrrT 

TTrCAGACTACTGCTGGTTATTTTTTCATTGGTTGCAGAAGTCAAGTTGTA 

CCATAGTTTGAAGAAAAATGTCAGCAGTAAGATCAGTTTCTGGTACCTTCT 

ATITACAACCGTTGCTCCAGGAATGTCTCATAGCACGATAGCCTTATTACC 

ATCCTCITITGCTATGGTTTGTCACACTTTTGCCATTAGATACGTCATTGAT 

TACCTACAATTACCAACATTAATGCGCACAATCAGAGAGACTGCTGCCAT 

CTCACCAGCTCACAAACAACAACTAGCCAACTCTCTC 

P. pastoris Alg9p 

WPSCIXDTSFYSNQHTCSFrCSCMYWPn^ZDLISTFyGnSDCDEVFNYWEPL 
MTvIUlGFCKQTWEYSPEYAIRSWSYLVPLWIAGYPPLFLDIPSYYFFYFFTlLLL 
VIFSLVAEVia.YHSLKIO^SSKISFWYIXFrTVAPGMSHSTIALIJ»^ 
TFAmYVIDYI^IJrnjvlRTlRETAAISPAHKQQLANSL 
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FIGURE 12 (sheet 1) 



P. pastoris ALG9 BLAST 



Sequences producing significant alignments:                                  Score    E
                                                                             (bits)   Value

gi|6324110|ref|NP_014180.1|   catalyzes the transfer of manno...             131      1e-29
gi|21296668|gb|EAA08813.1|    agCP7810 [Anopheles gambiae str...             110      2e-23
gi|7019765|emb|CAB75773.1|    putative mannosyltransferase inv...            104      1e-21
gi|26341066|dbj|BAC34195.1|   unnamed protein product [Mus mu...             99       4e-20
gi|16551378|gb|AAL25798.1|    DIBD1 [Homo sapiens]                                    4e-20
gi|19527202|ref|NP_598742.1|  RIKEN cDNA 8230402H15 [Mus mus...              99       4e-20
gi|12053349|emb|CAB66861.1|   hypothetical protein [Homo sapi...             99       4e-20



Alignments



S. cerevisiae
Score = 131 bits (329), Expect = 1e-29

Identities = 62/141 (43%), Positives = 91/141 (64%), Gaps = 1/141 (0%) 
Frame = +2 

Query: 200 ISTFYGIISDCDEVFNYWEPLNFMI^GFGKQTWEySPEyAlR^ 376 

I + +ISDCDE FNYWEPLN ++RGFGKQTWEySPEY+IRSW++Ii+P + YP F 
Sbjct: 21 IQPTFSLISIXDETFNYWEPLNIjLTOGPGKOTWEYSPEYSIRSWA 80 

Query: 377 IJ)IPSXXXXXXXIUiIiLVIFSLVAEVKLYHSLK3CNVSSKISFWYIiLB^^ 556 

D+ S R Ii FS + E KI,+ + +++ +1+ +++F PG H+++ L 

Sbjct: 81 TDliESHWNFFITRACLGFFSFIMEPKIiHREIAGSIiAWIANIW 140 

Query: 557 LPSSPAMVCHTPAIRYVIDYL 619 

LPS+ AM+ + A R+ + YL 
Sbjct: 141 IiPSAVAMLLYVGATRHSUlYI. 161 



Anopheles gambiae 
Score = 110 bits (274), Expect = 2e-23 

Identities = 58/130 (44%), Positives = 79/130 (60%), Gaps = 3/130 (2%)
Frame = +2 

Query: 197 LISTFYGIISDCDEVFimJEPIiNFMLRGFGKQTWEYSPEYAIRSWSYLVPIiWIJ^^ 376 

L S Y IISDCDE +NYWEPL+++L+Q G QTWEYSPE+A+RS+SY LW+ G P 
Sbjct: 34 LQSALYSIISDCDETYNYWBPXiHYIiLKGKGFQTWEYSPEFAIiRSySY---IiW^ 90 

Query: 377 LDIPS XXXXXXXRLLIjVIFSLVAEVKLYHSLKKNVSSKISFWYLIiPTTV^^ 547 

L++ RIjL+ +E +LY L + ++ +I«riF + GM S+ 

Sbjct: 91 IiQUffimGVLIFYFVRCIiIAVTCALIjEYI^ ISO 

Query: 548 lALLPSSFAM 577 

AIiLPSSF+M 
Sbjct: 151 AAIOiPSSFSM 160 
FIGURE 12 (sheet 2)

S. pombe

Score = 104 bits (260), Expect = 1e-21

Identities = 58/157 (36%), Positives = 85/157 (54%)

Frame = +2

Query: 197 LISTFYGIISDCDEVFNYWEPLKTFMIJIGFGKQTWEYSPEYAI^ 376 

L S + +1 Da)EV+NYWEPL+++L G+G QTWEYSPEYAIRSW Y+ + G+ 
Sbjct: 26 LTSASFKVIDDCDEVYimJEPIJmiLYGYGLQTWEYSPEYAIRSW^ 85 

Query: 377 XJDlPSXXXXXXXRLLLVIFSLVAEVKLYHSIiKKNVSSKISF^^ 556 

L + R +L FS E L ++ +H + ++ V GM ++ + 

Sbjct: B6 LGIiSRIiHVFrFIRGVIiACFSAFCETNLIIAVaRK^^ 145 

Query: 557 LPSSFAMVCHTFAIRYVIDYLQI*PTIiMRTIRETAAIS 667 

LPSSFAM T A+ L P+ RT++ + 1+ 

Sbjct: 146 LPSSPAMHMVTI*ALS---AQIiSPPSTKRTVKVVSPIT 179 



M. musculus
Score = 99.4 bits (246), Expect = 4e-20

Identities = 57/143 (39%), Positives = 76/143 (53%), Gaps = 1/143 (0%)
Frame = +2

Query: 152 SPTCSC^^rWPII^*DLISTFYGIISDa5EOTNYTOPIiNFMIJRGFGI^ 331 

+P S + +LS li + ISDCDE FNYWEP ++++ G G QTWEYSP YAIRS+ 

Sbjct: 55 KBEG3rAFKCLLSKRJbC2MAjS^ 

Query: 332 SY-LVPLWIAGYPPLFIiDIPSXXXXXXXJOiLLVIFSLVAEVKLYHSLK^ 508 

+Y L+ W A + Ij R LL S V E+ Y ++ K +S L 

Sbjct: 115 AYIiLIjHftWPAAFHARIIjQTNKILVFyPIiRCIiL^ 174 

Query: 509 LFTTVAPGa^HSTIALLPSSFAM 577 

F +4 GM S+ A LPSSF M 
Sbjct: 175 AFLVLSTGMFCSSSAFLPSSFCM 197 



H. sapiens 
Score = 99.4 bits (246), Expect = 4e-20

Identities = 56/143 (39%), Positives = 76/143 (53%), Gaps = 1/143 (0%) 
Frame = +2 

Query: 152 SPTCSCMYWPIIiS*DIilSTFyGIISDCDEVFNYWEPIi»FMl^PGKQTWBySP 331 

+P S + +LS L + ISDCDE FNYWEP ++++ G G QTWEYSP YAIRS+ 

Sbjct: 55 APEGSTAFKCLLSARTiCaUOiLSinSDCDETTOYWEPTH^ 114 

Query: 332 SY-LVPLWIAGYPPLFIiDIPSXXXXXXXRIiLLVIFSLVAEVKLYHSI.KKOTSSKISFW^ 508 

+Y L+ W A + L R lOi S + E+ Y ++ K +S L 

Sbjct: 115 AYIiLLHAWPAAFHARILQTNiaLVFYFLRCIjL^^ 174 

Query: 509 LFTTVAPCMSHSTIALLPSSFAM 577 

F ++ GM S+ A LPSSF M 
Sbjct: 175 AFLVLSTGMFCSSSAFLPSSFCM 197 
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FIGURE 13 



S. cerevisiae ALG12

ATGCGTTGGTCTGTCCTTGATACAGTGCTATTGACCGTGATITCCrTTCATCTAAT 

CCAAGCT(XATTCACCAAGGTGGAAGAGAGTTTTAATATTCAAGCCATTCATGAT 

ATmAACCTACAGCGTATITGATATCrCCCAATATGACCACTTGAAATTrCCrGG 

AGTAGTCCCTAGAACATTCGTTGGTGCTGTGATrATTGCAATGCTrTCXSAGACCTT 

ATCmACTTGAGTTCTITGATCXAAACITCCAGGCCTACG TCTAT AGATGTTCAA 

TrGGTCGTrAGGGGGATTGTTGGCCTCACCAATGGGCri'lUlll lATCTATTTAAA 

GAATTGTITGCAAGATATGTTTGATGAAATCACTGAAAAGAAAAA GGAAGAAAA 

TGAAGACAAGOATATATACATTTACGATAGCGCIXKjTACATGGlllUllirATTrT 

TAATTGGCAGTTrCCACCTCATGTrCTACAGCACTAG GACTC TGCCTAATTTTGTC 

ATGACTCTGCCTCTAACCAACGTCGCATTGGGGTGGGTTTTATTGGGTCGTrATAA 

TGCAGCTATATrCCTATCriGaKrrCGTGGCAATTGTATITAGACT^ 

CnxrrCAGTGCTGGTATTGCIUrATITAGCGTCATCTrCAAGAAGATTTC^ 

GATGCTATCAAATTCGGTATCTITGGCTrGGGACTTGGTrCCGCCATCAGTATCAC 

CGTraATTCATATlTCTGGCAAGAATGGTGTCTACCTGAGGTAGATGGi riui i GT 

TCAACGTGGTTGCGGGTTACGCrreCAAGTGGGGTGTGGA GCCA GTtACrGCTTA 

TirCACGCATTACTTGAGAATGATGTrrATGCCACCAACTGTITrACTATrGAATr 

AClTCGGCTATAAATrAGCACCIGCAAAATTAAAAATTGT(:nx:ACTAGCATCTOT 

TTCCACATTATCGTCTrATCCnTCAACCrCACAAAGAATGGAGATTCATCATCTA 

CGCreTrCCATCTATCATGTTGCTAGGTGCCACAGGAGCAGCACATCTATGGGAG 

AATATGAAAGTAAAAAAGATTA(XAATGTriTATGTTrGGCTATATTGCCCTTATC 

TATAATGACCTCCTITITCATrrCAATGGCGTTCnTGTATATATCAAGAATGAATT 

AT(XAGGa3GCGAGGCTrrAAL'lli;illiAATGACATGATTGTGGAAAAA AATA T 

TACAAACGCTACAGTrCATATCAGCATACCT<XTTGCATGACAGGTGTCACrTTAT 

TTGGTGAATTGAACTACGGTGTGTACGGCATCAATTACGATAAGACTGAAAATAC 

GACTTTACTGCAGGAAATGTGGCCXTCCITTGATTTCTTGA 

ACCGCCTCTCAATTGCCATTCGAGAATAAGACTACCAACCATTGGG AGCrAGTTA 

ACACAACAAAGATGTTrACTGGATTTGACCCAACCTACATr AAGA ACTTTGTTTT 

CCAAGAGAGAGTGAATGTTITGTCTCTACrCAAACAGATCATTTTCGACAAGACC 

CCTACCGTTTTTTTGAAAGAATrGACGGCCAATTCGATTGTrAAAAGCGATGTCTT 

CTTCACCTATAAGAGAATCAAACAAGATGAAAAAACTGATTGA 

S. cerevisiae Alg12p

MRWSVIX>TVIXTVISFHLIQAPFnCVEESFMQAJHDILT^^ 

RTFVGAVHAMI^RPYLYLSSLIQTSRPTSroVQLVVRGIVGLTNGI^FIYI^CI^ 

FDErrEKKKEENEDKDIYIYDSAGTWFLIJLIGSFHIJv^ 

GW\a.LGRYNAAIFLSALVAIVRRIJEVSALSAGIAIJSVlFKK^ 

AISITVDSYFWQEW(XPEVDGFIJrNWAGYASKWGVEPVTAYFniYI^^ 

LLLNYFGYKIJ^AKLKIVSIJ^SIJ'HnVI^FQPHKEWR^ 

E^IMKVKX^NVIXIJ\ILPI^IMTSFFISMA^.YISR^ 

ATVmSIPPavn'GVTLFGEimGVYGINYDKlEhmXLQEMV^ 

FENKTTNHWEL\TmKMFTGFDPTYIKNFWQERVNVI^^^ 

ANSIVKSDVEFTYKIUKQDEKTD 
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FIGURE 14 



P. pastoris ALG12 

TCGGTCGAGAATGATAACTGAAGAACTCAAAATCTCTCACACTTTCATCGT 

TACTGTACTGGCAATCATTGCATTTCAGCCTCATAAAGAATGGAGATTTAT 

AGTTTACAITGTTCCACCACTTGTCATCACCATATCTACAGTACTTGCACA 

ACTACCCAGGAGATTCACAATCGTCAAAGTTGCTGTTTTTCTCCTAAGTTT 

CGGCTCTTTGCTCATATCCCTGTCGTTTCnTTTCATCTCATCGTATAACTAC 

CCTGGGGGTGAAGCTTTACAGCATTTGAACGAGAAACTCCTTCTACTGGA 

CCAAAGTTCCCTACCTGTTGATATTAAGGTTCATATGGATGTCCCTGCATG 

CATGACTGGGGTGACTTTATTTGGTTACTTGGATAACTCAAAATTGAACAA 

TTTAAGAATTGTCTATGATAAAACAGAAGACGAGTCGCTGGACACAATCT 

GGGATTCTTTCAATTATGTCATCTCCGAAATTGACTTGGATTCTTCGACTG 

CTCCCAAATGGGAGGGGGATTGGCTGAAGATTGATGTTGTCCAAGGCTAC 

AACGGCATCAATAAACAATCTATCAAAAATACAATTTTCAATTATGGAAT 

ACTTAAACGGATGATAAGAGACGCAACCAAACTTGATGTTGGATTTATTC 

GTACGGT(mTCGATCCTrCATAAAATTTGATGATAAATTATTCATTTATG 

AGAGGAGCAGTCAAACCTGAAAATATATACCTCATITGTTCAA'nTGGTGT 

AAAGAGTGTGGCGGATAGACITCrTGTAAATCAGGAAAGCTACAATTCCA 

ATTGCTGCAAAAAATACCAATGCCCATAA 

P. pastoris Alg12p

RMrreEUaSHTFIVTVIAIIAFQPHKEWRFIVYIW 

KVAWLI^FGSLLISI^FIJETSSYNYPGGEAIX^HIJ^EKLLLIJ)QSSLPVDIK^ 
MDWACMrGVTIJGYII>NSKLNNmVYDKTEDESIJ)T^ 
SSTAPKWEGDWIJaDWQGYNGI^IKQSIK2mF^^yGIUKMIRDAT^ 
RTVEEISFIKTDDKLFIYERSSQ 
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P. pastoris ALG12 BLAST 



Sequences producing significant alignments: 



                                                                             Score    E
                                                                             (bits)   Value

gi|1302525|emb|CAA96310.1|    ORF YNR030w [Saccharomyces cerev...            102      5e-21
gi|19112221|ref|NP_595429.1|  putative involvement in cell w...              56       5e-07
gi|15864569|emb|CAC83681.1|   putative dolichyl-p-man:Man7Gl...              53       4e-06
gi|13129114|ref|NP_077010.1|  dolichyl-p-mannose:Man7GlcNAc2...              53       4e-06
gi|22266724|gb|AAM94900.1|AF311904_1  membrane protein SB87                  53       4e-06
gi|16478284|emb|CAD22101.1|   putative mannosyl transferase [M...            52       8e-06



Alignments 
S. cerevisiae 



Score = 102 bits (255), Expect = 5e-21

Identities = 74/258 (28%), Positives = 121/258 (46%), Gaps = 19/258 (7%)



Query: 8 i?MITEELKZSHTFIVT\n:j^IIAFQPHKEWRFIVyiVPPLVITISTV^ 187

++ +LKI + + +++FQPHKEV?RFI+y VP +++ +T A L + K-f

Sbjct: 302 KLAPAKLKIVSIiASLFHI IVLS FQPHKBWRFI I YAVPS IKLIiGATGAAHIiWENMKVKKIT 361

Query: 188 VXXXXXXXXXXXXXXXXXXXYNYPGGET^HLNEKL^^ 346

+ NYPGGEAL N+ ++ + VH+

Sbjct: 362 NVLCIjAILPLS IMTSFFISMAFLYTSRMNYPGGEALTSFNDMIV EKNITNATVHI 417

Query: 347 VPAC34TGVTLFGm3NSKLNNIiRrra3KT^ - LDTIWDSFNYVI SEIDLDSS 505

+P CMTGVTEiFG I1+ I YDKTE+ + L +W SP+++I S++ ++

Sbjct: 418 IPPCMTCVTLFGELNYGVYG INYDKTENmlLQE^IWPSFDFIJITHEPTASQrJPFENK 474

Query: 506 TAPKWEGDWLKIDVVQGYNGINKQSIKNTIFN YGILKRMIRDATKIjDVGPIRTVF 670

T WE ++ + + G + IKN +F +LK++I D K F++ +

Sbjct: 475 rrUHWE LVOTTKMFTGFDPTYIKNFVFQERVNVLSIiLRQI IFD -- KTPTVFLKELT 528

Query: 671 RSFIlCFDDICriFIYERSSQ 724

+ 1 D F y+R Q

Sbjct: 529 ANSIVKSDVFFTYKRIKQ 546





S. pombe 

Score = 56.2 bits (134), Expect = 5e-07 

Identities = 46/152 (30%), Positives = 62/152 (40%), Gaps = 11/152 (7%)

Query: 65 IIAFQPHKEWRFIVYIVPPLVITISTVIAQL PRRFTIVKVAVXXXXXXXXXX 220 

+ +F HKEWRFI+Y + P S + AL +F I+++ 

Sbjct: 295 VySFIiGEQCEHRFIIYSI-PfTFZIAASAIGASIiCFNASK^ 353 

Query: 221 XSCXXXXXXXYNYPGGEALQHLNEKIiLIiUDQSSIiPVDIKVH^ 400 

Y YPGG AI* L E + VHMDV CMTG+T F L + 

Sbjct: 354 SSFIiLyVPQYAYP(3GLALTRLYE lENHPQVSVHMDVYPCMTGITRFSQLPS — 404 
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Query: 401 LNNLRIVYDKTEDESL DTIWDSENYVISE 487 

YDKTED + P+Y+I+E 
Sbjct: 405 WYYDKrEDPKMLSNSLFISQFDYlilTE 431 



FIGURE 15 (sheet 2)



Homo sapiens 
Score = 53.1 bits (126), Expect = 4e-06

Identities = 41/149 (27%), Positives = 68/149 (45%), Gaps = 6/149 (4%)

Query: 59 LAIIAFQPHKEWRFIVYIVTPLVITISTVLAQLPRR - FTI VKVAVXXXXXXXXXX 220 

+A+ + PHKE RFI+Y PIiIT++L + +V 

Sbjct: 299 MALYSLIiPHKEIiRFIIYAFPMIiNITAARGCSYLIiNNYKKSWL 358 

Query: 221 XXXXXXXXXYirYPGGEALQHLNEKLLLIiIX^ 400 

+NYPGG A+Q L++ L+ Q+ D+ +H+DV A TGV+ F ++++ 
Sbjct: 359 SATALYVSHFNYPGGVAMQRI1HQ--LVPPQT DVLIiHIDVAAAQTGVSRFIiQVNSAW 412 

Query: 401 LNiraaVYDKrEDESLDTrWDSFNYVISE 487 

YDK ED T ++ +++ E 
Sbjct: 413 R YDKREDVQPGTGMLAYTHILMB 435 
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FIGURE 25 

S. cerevisiae ALG6

ATGGCCATTGGCAAAAGGTTACTGGtGAACAAACCAGCAGAAGAATCATT 

ITATGCTTCTCCAATGTATGATTmTGTATCCGTTTAGGCCAGTGGGGAA 

CCAATGGCTGCCAGAATATATTATCnTGTATGTGCTGTAATACTGAGGTG 

CACAATTGGACTTGGTCCATATTCTGGGAAAGGCAGTCCACCGCTGTACG 

GCGATTTTGAGGCTCAGAGACATTGGATGGAAATTACGCAACATTTACCG 

CnrCTAAGTGGTACTGGTATGATTTGCAATACTGGGGATTGGACTATCCA 

CCATTAACAGCATTTCATrCGTACCTTCTGGGCCTAATTGGATCTTTnrCA 

ATCCATCTTGGTTTGCACTAGAAAAGTCACGTGGCTTTGAATCCCCCGATA 

ATGGCCTGAAAACATATATGCGTTCTACTGTCATCATTAGCGACATATTGT 

TITACTTTCCTGCAGTAATATACTTTACTAAGTGGCTTGGTAGATATCGAA 

ACCAGTCGCCCATAGGACAATCTATTGCGGCATCAGCGATTTTGTTCCAAC 

CTTCATTAATGCTCAITGACCATGGGCACTTTCAATATAATTCAGTCATGC 

TrGGCCTTACTGCTTATGCCATAAATAACTTATTAGATGAGTATTATGCTA 

TGGCGGCCGTTTGTTTTGTCCTATCCATTTGrrTTAAACAAATGGCATTGTA 

TTATGCACCGATTITrmGCrTATCTATTAAGTCGATCATTGCTGTTCCCC 

AAATTTAACATAGCTAGATTGACGGTTATTGCGTTTGCAACACTCGCAACT 

TTrGCTATAATATTTGCGCCATTATATTTCTTGGGAGGAGGATTAAAGAAT 

ATTCACCAATGTATTCACAGGATATTCCCTTTTGCCAGGGGCATCTTCGAA 

GACAAGGTTGCTAACTTCTGGTGCGTTACGAACGTGTTTGTAAAATACAA 

GGAAAGATTCACTATACAACAACTCCAGCTATATTCATTGATTGCCACCGT 

GATTGGTTTCTTACCAGCCATGATAATGACATTACT TCATCCCAAAAA GCA 

TCrrCTCCCATACGTGTTAATCGCATGTTCGATGTCCnTITITCTTm 

TirCAAGTACATGAGAAAACTATCCTCATCCCACTTTTGCCTATTACACTA 

CTCTACTCCTCTACTGATTGGAATGTTCTATCTCTTGTAAGTTGGATAAAC 

AATGTGGCirrGTTTACGCTATGGCCTTTGTTGAAAAAGGACGGTCTTCAT 

TTACAGTATGCCGTATCTITCTTACTAAGCAATTGGCTGATTGGAAATTTC 

AGlTITATTACACCAAGGTTCTTGCCAAAATCnTrAACTCCTGGCCCTTCT 

ATCAGCAGCATCAATAGCGACTATAGAAGAAGAAGCTTACTGCCATATAA 

TGTGGTTTGGAAAAGTTTTATCATAGGAACGTATATTGCTATGGGCTTTTA 

TCATTTCTTAGATCAATTTGTAGCACCTCCATCGAAATATCCAGACTTGTG 

GGTGTTGTTGAACTGTGCTGTTGGGTTCATTTGCTTTAGCATATTTTGGCTA 

TCGTCTTATTACAAGATATTCACTTCCGGTAGCAAATCCATGAAGGACTTG 

TAG 

S. cerevisiae Alg6p

MMGKia,LVl^AEESF^ASPMYDFLYPFia»VGNQWIPEYIIFVCAVILRCTIG 

LGPYSGKGSPPLYGDFEAQRHWMEITQHLPLSKWYWYDLQYWGLDYPPLTA 

FHSYLUJLIGSFENPSWFALEKSRGFESPDNGIXTYMRSTVnSDIUf^ 

FTKWmRYimQSPIGQSIAASAIIJfQPSIMLIDHGHFQYNSVMmL^^ 

LIJ>EYYAMAAVCFVLSICTKQMALYYAPFFAYLI^RSLIfPKFNIAia,TV^ 

ATIJVTFAIIFAPLYFmGGIJaSimQCIHIUITFARGlFEDKYAlSIFW 

YKEI«fTIQQIXJLYSLIATVIGFIJ>AMIMTLIJEn'KBaiLLPYVLIACSM^ 

VHEKmIPLIJP^lI.YSSTDW^rva^LVSWlN^^/AIJTLW 

VSFLLSNWLIGl^SFnPM'LPKSLTPGPSISSINSDYRRRSLI^YNVVWKSFnGT 

YIAMGFYHFUJQFVAPPSK^DLWVIXNCAVGFICFSIFWLWSYYKIFTSGSK 

SMKDL 
FIGURE 26



P. pastoris ALG6 

ATGCCACATAAAAGAACGCCCTCTAGCAGTCTGCTGTATGCAAGAATTCC 

AGGGATCTCTTTTGAAAACTCTCCGGTGTTTGATTITITGTCrCCTm 

CCCGCTCCTAATCAATGGGTAGCACGATACATCATCATCATCTTTGCAATT 

CTCATCAGATTGGCAGTTGGGCTGGGCTCCTATTCCGGCTTCAACACCCCT 

CCAATGTATGGGGATTTTGAAGCTCAGAGGCATTGGATGGAAATTACTCA 

GCATTTATCCATAGAAAAATGGTACTTCTACGACTTGCAATATTGGGGGCT 

TGACTATCCTCCCITGACAGCCTTTCATTCATACTTCTrTGGCAAATTAGGC 

AGCTTCATCAATCCAGCATGGTTTGCTTTAGACGTCTCCAGAGGGTTTGAA 

TCAGTGGATCTAAAATCGTACATGAGGGCX3ACCGCAATTCTCAGTGAGCT 

GTTATGTTTTATTCCAGCTGTCATTTGGTATTGTCGTTGGATGGGACTTAAC 

TACTTCAATCAAAACGCCATTGAGCAAACTATAATAGCGTCTGCTATTCTT 

TTCAATCCATCTTTAATTATCATAGATCATGGCCACTTCCAGTACAACTCA 

GTTATGCTAGGTTTTGCTTTATTATCCATATTAAATCTGTTGTACGATAATT 

TTGCATTAGCGGCrATTTTTTTCGTTCTTTCAATAAGCTTTAAGCAAATGGC 

TCTCrATTATAGCCCCATCATGTITITITACATGCTGAGTGTGAGTTGTTGG 

CCTITGAAAAACTTCAACTTGTTGAGATTGGCTACTATCAGTATTGCAGTA 

CTCTTGACTTITGCAACTCTATTACTGCCTTTTGTATTAGTAGATGGGATGT 

CACAAATTGGCCAAATATTATTCAGAGTITrCCCGTTTTCAAGAGGCTTGT 

TTGAGGATAAGGTGGCCAACmTGGTGTACAACGAATATACTGGTAAAG 

TACAAACAGTTATTCACTGACAAAACCCTTACTAGGATATCGCTAGTAGC 

AACTTTGATTGCAATTAGTCCGTCrrGCTTCATCATTJTTACTCACCCAAAG 

AAGGTTITACTACCGTGGGCTTTTGCTGCTTGCTCTTGGGCGTTCTATCm 

TCTCTTTCCAAGTCCACGAGAAATCAGTTTTAGTTCCATTGATGCCTACCA 

CTCTATTACTGGTAGAAAAAGACTTGGACATCATCTCAATGGTCTGCTGGA 

TTTCTAATATTGCCTTCTTCAGCATGTGGCCTCTATTAAAAAGAGACGGGC 

TGGCTTTGGAATATTTTGTCTTGGGAATATTGAGTAATTGGCTGATTGGA^ 

ACCTCAATTGGATTAGTAAATGGCITGTCCCCAGTTTCCTGATTCCAGGGC 

CTACTCTCTCCAAAAAAGTTCCTAAAAGAGATACTAAAACAGTTGTTCAT 

ACnCACTGGTTTTGGGGGTCAGTAACATTCGTTTCATACCTCGGAGCTACA 

GTTATCCAGTTCGTAGATTGGCTGTACCTTCCACCTGCCAAGTATCCAGAT 

ITGTGGGTTATTTTGAACACTACATTGTCGTTTGCTTGTTTCGGGTTGTm 

GGCTATGGATTAACTACAATCTGTACATTTTGCXSTGATITrAAGCTTAAAG 

ATGCTTAG 

P. pastoris Alg6 

MPHKRTPSSSLLYAMPGBFENSPVTOFI^PFGPAPNQWVARYnilFAILIRLAV 

GLGSYSGENTPPMYGDFEAQKEIWMEITQHLSIEKWYFYDLQYWGLDYPPLT 

AEHSYFFGKLGSFINPAWFALDVSRGFESVDIJCSYMRATAILSELL^^ 

Y(mWMGI>JYFNQNAffiQTIIASAILFl^SUiroHGHFQYNSVMm 

LYDNFAIAAIFFVI^ISFKQMALYYSPIMFFYMI^VSCWPLKNFNLI^ATISI 

A\a.LTFATIJXPFVLVDGMSQIGQILFRVFPFSRGlFEDKVA]SIFW(^^ 

YKQLFroKTLTTaSLVATLIAISPSawraPKKVLIJ»WAFAACSWAFnJ?SF 

VHEKSVLVPDVlPTTLLLVEKDIJ)nSMVCrWISl>^^ 

VLGILSNWLIGNO^WISKWLWSFLIPGPTI^KKVPK3U)TKTVV^ 

VTFVSYLGATVIQFVDWLYIJ»PAinrPDLWVIIOTTI^FAaFGIJ^^ 

yhjudeklkda 
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P. pastoris ALG6 BLAST 



Sequences producing significant alignments:                                  Score    E
                                                                             (bits)   Value

gi|1420090|emb|CAA99190.1|    ORF YOR002w [Saccharomyces cerev...            489      e-137
gi|7490584|pir||T40396        glucosyltransferase - fission yeast            369      e-101
gi|19921070|ref|NP_609393.1|  CG5091-PA [Drosophila melanoga...              247      4e-64
gi|15240920|ref|NP_198662.1|  glucosyltransferase-like prote...              244      3e-63
gi|7019325|ref|NP_037471.1|   dolichyl-P-Glc:Man9GlcNAc2-PP-d...             238      2e-61
gi|12002040|gb|AAG43163.1|AF063604_1  brain my046 protein [H...              236      7e-61
gi|1176671|sp|Q09226|ALG6_CAEEL  Probable dolichyl pyrophosp...              222      9e-57
gi|21302638|gb|EAA14783.1|    agCP4617 [Anopheles gambiae str...             219      8e-56
gi|5441788|emb|CAB46771.1|    probable glucosyltransferase [Sc...            192      1e-47
gi|13129070|ref|NP_076984.1|  hypothetical protein MGC2840 s...              112      1e-23
gi|2996578|emb|CAA12176.1|    glucosyltransferase [Homo sapiens]             112      1e-23
gi|20835439|ref|XP_131506.1|  similar to Dolichyl pyrophosph...              104      3e-21



Alignments

S. cerevisiae

Score = 489 bits (1259), Expect = e-137

Identities = 274/530 (51%), Positives = 358/530 (67%), Gaps = 5/530 (0%)

Query: 20 SPENSPVTOFLSPFGPAPNQWVXXXXXXXXXXXXXXSW 79 

SF SP++DFL PF P NQW+ +GW3 YSG +PP+YGDFEAQRH 

Sbjct: 16 SFYASPMTOFLYPFRPVGNQWLPEYIIFVCAVIIJRCTIGLGPYSGKGSPPLYGDFEAQRH 75 

Query: 80 WMEITQHLSIEKWYFYDLQYWGUJYPPLTAFHSYFFGKLGSFINP™^ 139 

WMEITQHI* + KWY+TOLQYWGLDYPPLTAFHSY Q +GSF NP+WFAIi+ SRGFES D 
Sbjct: 76 WMEITOHLPIiSKWYWYDIjQYWGIiDYPPLTAFHSYIJjGLIGSFFNPSWFALE 135 

Query: 140 --LKSYMRATAILSELLCFIPAVIWYCRWMGIOTTNQNAIEQT^ 197 

LK+YMR+T I+S++L + PAVI++ +W+G Y NQ+ I Q+I ASAILF PSIH.+IDH 
Sbjct: 136 NOLKTYMRST^aiSDIIiFyPPAVIYFTKWLG-RYiaTQSPIGQSIAASAILFQPSU^ 194 

Query: 198 GHFQYNSVI^FALI^II^LYDNFAIAAIFFVLSISFKQMALYYSPIMFF^ 257

GHFQYNSVMLG +1 NLL + +A+AA+ FVLSI FKQMALYY+PI P Y+LS S 
Sbjct: 195 GHFQYNSVML^TAYAINNLIjDEyYAMAAVCFVLSICF 253 

Query: 258 LKK™iiLIUATISIA\n^TFATLLLP-FVLVDGMSQIGQILFRWPFSRGL 316 

KN+ RIi 1+ A L TFA + P + L G+ IQ + R+FPF+RG+FEDKVANFW 
Sbjct: 254 FPKFNIA!a*T\ra^ATIiATPAIIFAPLyFIjGGGIiKHIHQCim 313 

Query: 317 CTTiaLVKyKQLFTDKTLTRISLVATLIAISPSCFIIFTHPKKV^ 376 

C TH+ VKyK+ FT + L SL+AT+I P+ + HPKK I1LP+ ACS +F+LF 
Sbjct: 314 CVTNVFVKYKERFTIQQLQLYSLIATVIGFLPAMIMTIiIOT 373 

Query: 377 SFQVmTf^Tirinnnnnnrrr^ 436 

SFQVHEK+ D +++S+V WI+N+A F++WPLLK+DGL L+Y V + 

Sbjct: 374 SFQVHEKTILIPI*LPlTLLYSSTDWKWIiSLVSWIim\MjFTLWPLLK^ 433 

Query: 437 LSNWLIGNIjNWISKWLVPSFLIPGPTIiSICKVPK^ 496 

LSNWLIGN +P L P6P++S ++++ + W S +Y+ 

Sbjct: 434 IfiNWLIGNPSPITPRFIiPKSLTPGPSISSINSDYRRRSIiLPYNVWra 493 
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Query: 497 QFVDWLYLPPAKYPDIiWIIOTTLSFACFGLFWLVramnjyiliRDFm 546 

P+D PP+KYPDLWV+LN + F CF +FWLW Y :f+ +KD 
Sbjct: 494 HFIJJQFWlPPSKyPDIiWVIiIJICAVGFICFSIFWLWSYyKIFTSGSKSM 543 

S. pombe

Score = 369 bits (946), Expect = e-101

Identities = 228/513 (44%), Positives = 315/513 (61%), Gaps = 35/513 (6%) 

Query: 21 FEN-SPVPDFIiSPPGPAPKQWVXXXXXXXXXXXXXXXVGLGSYSGEOTPPMTt^ 79 

FEN +PV F+S P ++++ + +G YSG+NTPPMTGDFEAQRH 

Sbjct: 5 FENGAPVQQFVSRFRSYSSKFIiFPPCLIMSLVFMQWIiISIGPYSC^YNTPPMym 64 

Query: 80 WMEITQHLSIEKimTOIiQyWCSLDYPPLTAFHSYPFGKIiGS-FIlin^ 138

WME+T H + +WYF DIiQ+WGIiDYPPIiTA+ S+FFG +G P NP WFA SRGFES+ 
Sbjct: 65 Wl^TI^HTPVSQWYFRDLQVWGIjDYPPLTAYVSin^FGIIGHyF^^ 124 

Query: 139 DIiKSYMRATAILSELIiCFIPAVIWyaiWMGIiNYFNQNAIEQTIIASAI^ 198 

+LK +MR+T I S LL +P +++Y +W N +++ +IjF P+IH-+IDHG 

Sbjct: 125 ELKLFMRSTVIASHIiLILVPPLMFYSKPWSRia--PNFVDRNASLX^^ 182 

Query: 199 HFQYNSVMMFALLSILNIiLYDNFALAAIFFVIiSISFKQMALYYSPIMF 258 

HFQYN VMIjG + +1 NLL + + A FF L+++FKQMALy++P +FFY+L P 
Sbjct: 183 HFQYNCVMLGLVMYAIAinJ^QYVAATFFFa^TFKQMALOT 242 

Query: 259 KNFNLLRIiATISIAVLLTFATLLLPFVLVIXSMSQIGQILFRWPFSRGLFEDK^ 318 

F+ R +S+ V+ TF+ +L P++ +D + + QII» RVFPF+RGL+EDKVANFWCT 
Sbjct: 243 IRFS-~RFIIiIiSVTWFTFSIiILFPWIYI©YlCriiLPQILHRVFPFARGL 300 

Query: 319 THILVKnCQIiFTDKTLTRISLVATLIAISPSCFIIFTHPKKVl^ 378 

N + K +++FT L ISIH- TLI+I PSC I+F +P+K LL FA+ SW F+LFSF 
Sbjct: 301 LNTVFICTREVFTLHQLQVISLIFTLISILPSCVILFLYPRKRLLALGFASASWGFFLFSF 360 

Query: 379 QVHEKSXXXXXXXXXXXXXEKDLDIISMVCWISNIAPPSMHPLLm^ 436 

QVHEKS ++ + +N+A FS+WPLLK+DGL L+YF L ++ 

Sbjct: 361 QVHEKSVLLPLLPTSILLCHG!nTTKPWIALANNLAVFSLWPLLKKDGLGLQYFTLV^^ 420 

Query: 439 KHWLIGNLNWISKWLVPSFLIPGPTLSKKVPKRDTKIVVHTHS?^ 498 

NW IG++ SK ++ P + Y+G VI 

Sbjct: 421 NW-IGDMWPSKNVLPRP IQLSFYVGMTVIIjG 451 

Query: 499 VDWLYLPPAKYPDLWVILNTTLSPACFGLFWLW 531 

•l-D PP-l-+YPDIiWILN TliSFA P -l-LH 
Sbjct: 452 IDLFIPPPSRYPDLWILNVTLSFAGPFTIYLW 484 



D. melanogaster

Score = 247 bits (630), Expect = 4e-64

Identities = 175/490 (35%), Positives = 267/490 (54%), Gaps = 55/490 (11%) 

Query: 57 VGLGSYSGFNTPPMYGDFEAQRHWMEITQHLSIEKWYF YDLQYWGLDYPPLTAFHS 112 

+ L SYSGF++PPM+GD+EAQRHW EIT +L4.+ +WY DLQYWGLDYPPLTA+HS 
Sbjct: 19 ISLYSYSGFDSPPMHGDYEAQRHWQEITVNLAVGEWYTNSSNKDLQYWGLDYPPLTAYHS 78 

Query: 113 YFFGKLGSFINPAWFALDVSRGFESVDLKSYMRATAILSELLCFIPAVIWYCSi™ 172 

Y G++G+ I+P + L SRGFES + K +MRAT + +++L ++PA++ + + 

Sbjct: 79 YLVGRIGASIDPRFVELHKSRGFESKEHKRFMRATWSADVLIYLPAm.T.T.AYSLDKAFR 138 
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Query: 


173 


Sbjct: 


139 


Query: 


233 


Sbjct: 


196 


Query: 


291 


Sbjct: 


253 


Query: 


351 


Sbjct : 


3 J. J 


Query: 


410 


Sbjct: 


363 


Query: 


46S 


Sbjct: 


411 


Query: 


524 


Sbjct: 


457 



FIGURE 27 (sheet 3) 

NQlMEQTIIASAILPOTSLIIIDHCOTQYNSVm/SFJ^SILN^ 232 
+ + + + +A P +ID+CaiPQYN++ liQFA ++I +L F AA FF Ii+ 



+++KQM LY+S + FF L C K+F + ■»-+ 1+ VL TFA I. +P+ + + 
LimCQMELYHS-LPFFAFIJU^ECVSQKSFASFIAEISRlJ^^ -LGSL 

SQIGQIIiFRVFPFSRGLFEDKVANFWCTTHILVKyKQLPTDK^ 

+ Q+L R+FP +RG+FEDKVAN WC N++ K K+ ++ + + + TLIA P+ 



++F V A S AF+LFSFQVHEK+ + + CW 

inT.Tn?PT?TWnirt.IAI.FNTSLAFFLFSFaVHEKTILLTALPA I*FLLKCWP 3 62 



+ FSM PUj RDL+V+++ +SK I*S 
DEmLFIiEVTVFShn^LLARDEIJjVPAWATVAFHLIFKCFDSKSK IjS 410 



+ p + + + +s + A+ L +P P KyPDIiW ++ + S 

411 NEYPIiKyiANI SQIIiMISWVAS LTVPAPTKYPDLWPLIISVTSCG 455 



F LF+LW N 



A. thaliana

Score = 244 bits (622), Expect = 3e-63

Identities = 187/488 (38%), Positives = 248/488 (50%), Gaps = 39/488 (7%)

Query: 62 YSGFNfTPPMYGDFEAQRHWMEITQHLSIEKWY FYDLQYWGIiOyPPLTAFHSYFFGK 117 

YSG PP +GDFEAQRHWMEIT +L + WY + DIi yWGIjDYPPLTA+ SY G 
Sbjct: 61 YSGAGIPPKFGDFEAQRHWMEiraniPVIDWYRiraTYMDLTyWGm 120 

Query: 118 I,GSFINPAWFAIJ)VSRGFESVDLKSYMRATAILSEIiLCFIPAVIWyCKWMGI23^^ 177 

F NP AL SRG ES K MR T + S+ F PA +++ N 
Sbjct: 121 FLRFFNPESVAIiIiSSRGHESYLGKIiIihlRWTVIiSSDAFIFFPA;^ 180 

Query: 178 EQTIIASAILFOTSLIIIDHGHFQYNSV^^JGFALliSIIlNLLYDNPAIAAIFE^^ 237 

E + IL NP LI+IDHGHFQYN + LG + +1 +L ++ L + F L++S KQ 

Sbjct: 181 EVAWHIAMILLNPCtlLIDHGHFQYNCISLGIiTVGAIAAVLCESEVIiTCVLFSIi^^ 240 

Query: 238 MALYYSPIMFFYMIiSVSCOTLKNFmiliRIATISIAVIiLTFATLLLPF 297

M+ Y++P F C K+ +Ii + + IAV++TF P+ V + +Ii 

Sbjct: 241 MSAYFAPAFFSHLIiG-KCIiRRKS-PILSVIKIjGIAVIVTF\n[FWWPY--VHSIJ^ 296 

Query: 298 FRVFPFSRGLFEDKVANFWCrnmLVKYKQLFTDKTLTRISLVAm 357 

R+ PF RG++ED VANFWCTT+IIH-K+K LFT ISL AT++A PS P 

Sbjct: 297 SRLAPFERGIYEDYVANFWCTTSILIKWKNLFTTQSLKSISLAATIIJ^ 356 

Query: 358 KKVLLPWAFAACSWAFYLFSFQVHEKSXXXXXXXXXXXXXEKDIJDIISMVOTISNI^ 417 

+ S AFYLFSFQVHEKS L + ++ A FS 

Sbjct: 357 SNEGFLYGLLNSSMAFYIiFSFQVHEKSIIjMPFIiSATLIjA LKLPDHFSHLTYYAIiFS 412 
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Query: 418 mPLLKRDGIiALEYFVIiGIIiSNWLI GNIiNWISKWLVPSFL IPGPTLSKKVPKRD 471 

M+PIiL RDL + YLL + GN+IKVF PG 
Sbjct: 413 MFPIiIiCraKLIjIPYLTLSFLFWIYHSPGNHHRIQKTDVSFFSFKHFPGyV^ 464 

Query: 472 TKTVVHTHWFWGSVTFVSYLGMVIQFVI^ 531 

++ TH+F V V YIi PP KXP L+ Ii L F+ F +F + 

Sbjct: 465 LLRTHFFISWIBVLYIiTIK PPOKXPFLFEALIMHiCFSYFIMFAFy 511 

Query: 532 rNYNLYTL 539 

NY + L 
Sbjct: 512 TNYTQWTL 519 
FIGURE 28



K. lactis ALG6

ATCTCTGTTTCAACAGCTCTTGCATTCATTGGITCTTTCGGTCCAATCTATA 

TCTTTGGAGGATACAAGAACTTAGTGCAATCAATGCACAGGATTTTTCCAT 

ITGCCAGGGGTATCmGAAGATAAAGTTGCGAATTTTTGGTGCGTTTCTA 

ATATTTTCATCAAATATAGAAA.TCTATTCACTCAGAAGGATCTTCAATTAT 

ACTCATTACTCGCAACAGTTATTGGGCTTTTACCATCATTCATTATAACAT 

TTTTATACCCGAAGAGACATITACTACCATATGCTTTGGCCGCATGTTCGA 

TGTCATTCTTCTTATTCAGCTTCCAGGTTCATGAAAAGACAATCTrAlTAC 

CTTTACTTCCTATTACACTCTTGTACACGTCAAGAGATTGGAATGTTCTAT 

CATTGGTTTGTTGGATTAACAACGTGGCATTGTTTACACTCTGGCCATTAC 

TGAAAAAGGACAATCTAGTATTGCAATATGGAGTCATGTTCATGTTTAGC 

AATTGGTTGATCGGTAACTTCAGTTTCGTCACACCACGCTTCCTCCCAAAA 

TTTTTGACACCAGGGCCATCCATCAGTGATATAGATGTTGATTATAGACGG 

GCAAGTTTACTACCCAAGAGCCTAATATGGAGATTAATCATTGTTGGCTCA 

TATATTGCAATGGGGATTATTCATTTTCTAGACTATTACGTCTCCCCGCCA 

TCAAAATACCCTGATTrATGGGTGCTTGCCAATTGTTCCTTGGGCTTCTCA 

TGTTTTGTGACATTITGGATATGGAACAATTATAATTATTCGAAATGAGAA 

ACAGCACtTTGCAAGATTTA 



K. lactis Alg6p 

ISVSTALAHGSFGPIYlFGGYKNLVQSMinaFPFARGIFEDKVANFWCV'Sh^ 

YRmFTQKI)LQLYSLIATVIGLIJ>SFIITFLYPKRHIJJ'YALAACSMS^ 

VimKmiJPLLPITLLYTSRDWNVI^LVCWD^NVALFTLWPLIJ^ 

VlvdlWSNWLIGNFSFVTPIOflPOLTPGPSISDroyDYRIL^ 

GSYIAMGIIHFIJJYYVSPPSKYPDLWVIANCSIXjFSCFVTFWI^^ 

TALCKI 
K. lactis ALG6 BLAST



Sequences producing significant alignments:                                  Score    E
                                                                             (bits)   Value

gi|1420090|emb|CAA99190.1|    ORF YOR002w [Saccharomyces cerev...            392      e-108
gi|7490584|pir||T40396        glucosyltransferase - fission yeast...         187      2e-46
gi|15240920|ref|NP_198662.1|  glucosyltransferase-like prote...              117      2e-25
gi|7019325|ref|NP_037471.1|   dolichyl-P-Glc:Man9GlcNAc2-PP-d...             103      2e-21
gi|12002040|gb|AAG43163.1|AF063604_1  brain my046 protein [H...              102      8e-21
gi|19921070|ref|NP_609393.1|  CG5091-PA [Drosophila melanoga...              101      1e-20



Alignments 

S. cerevisiae
Score = 392 bits (1006), Expect = e-108

Identities = 182/280 (65%), Positives = 218/280 (77%), Gaps = 1/280 (0%)
Frame = +1

Query: 1 ISVSTAIAFIGSFGPIYIFGG-YKNLVQSMHRIFPFARGIFEDKVANFWW 177 

1+ +T F F P+Y GG KN+ Q +HiaFPFARGlFEDKVMSrFWCV+N+F+Ky+ 
Sbjct: 265 lAFATXATFAIIFAPLYFLGGGLKHlHQCIHRIFPFARGIFEDK^/TaTFWCV^ 324 

Query: 178 LFTQKDLQLYSIJATVIGLLPSFIITFLYPKRm.TiPYMJACSMSFFLFSFQVHEK^^ 357

FT + IiQLYSL+ATVlG LP+ I+T L+PK+HLLPY L ACSMSFFLFSFQVHEK 
Sbjct: 325 RFTIQQXiQLYSLIATVIGFLPAmMTLLHPK3CHLIiPYVIiIACSl^FFLFS 3 84 

, y+S DWNVLSLV WINNVALFTLWPLLKKD L LQY V F+ SNWLIGWFSF 

Sbjct: 385 PLLPITLLySSTDWNVIiSLVSWINOTALFTLWPLLKKIXSLH^ 444 

Query: 538 VTPRFLPKFIiTPGPSISDIDVDYRiy^LLPKSLIWRLIIVGSYIAMGirHFLDyyVSPPS 717 

+TPRFLPK LTPGPSIS 1+ DYRR SLLP +++W+ I+G+YIAMQ HFLD +V+PPS 
Sbjct: 445 ITPRFLPKSLTPGPSISSINSDYRRIlSLLPYNVVWKSFIIGTyiAMGFYHFLDQFVAPPS 504 

Query: 718 KYPDLWVIiANCSLGFSCFVTFWIWNNYXLFEMRNSTLQDL 837 

KYPDLWVIj NC++GF CF FW+W+ Y +F + +++DL 
Sbjct: 505 KYPDLWVLLNCAVGFICFSIFWLWSYYKIFTSGSKSMKDL 544 



S. pombe

Score = 187 bits (475), Expect = 2e-46
Identities = 106/280 (37%), Positives = 150/280 (53%), Gaps = 1/280 (0%)
Frame = +1

Query: 1 ISVSTAZJ^IGSFGPIYIFGGYlOffLV-QSMHRIFPFARGIFEDKVANFWCVSNIFIKYra 177 

+SV+ F P +1+ YK L+ Q +HR+PPFARG++EDKVANFWC N K R 

Sbjct: 251 LSVTVVFTFSLILFP-WIYMDYKIIiLPQIIjHRVFPFARGIiWEDKVAOT 309 

Query: 178 IiFTQKDMLYSLIATVIGLLPSFIITFLypKIfflLliPYAI^ 357 

+FT LQ+ SL+ T+1 +LPS +1 FLyP++ LL A+ S FFLFSFQVHEK 
Sbjct: 310 VFTiaQI^ISLIFTLISILPSCVlIiFLYPIUaUiIiAIjGFASASWQFFLFSFQf^^ 369 
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Query: 358 XXXXXXX3mrSI©WNVIJ5LVCWINKVMjFn^ 537 

+ + HN+A+F+LWPIiLKKD L IiQY + + NW 
Sbjct: 370 PIJJ>TSIIiLC3IGNITTKPWIAliMW^ 422 

Query: 538 VTPRFLPKPLTPGPSISDIDVDYRRASLLPKSLIWRLIITGSYIAMGII^ 717 

I D+ V K++++R I + y+ M +1 +D ++ PPS 

Sbjct: 423 IGDMW FSKNVTjFRFIQLSFYVGMTVXLGIDLFrPPPS 460 

Query: 718 KYPDliWVLANCSLGFSCFVTFWIWlWYXIiFEMWa^STW 837 

+ypDIiWV+ N +L F+ P T ++W L + + DL 
Sbjct: 461 RYPDLWILNVTLSFAGFPTIYIiWTIiGRIiIjHISSKIiSTDL 500 

A. thaliana 

Score = 117 bits (292), Expect = 2e-25 

Identities = 81/240 (33%), Positives = 120/240 (50%), Gaps = 2/240 (0%) 
Frame = +1

Query: 85 MHRXFPFARGIFEDKVANFWCVSKTIFIKYRlJIiFTQKDLQLYSIJ^ 264 

+ R+ PF RGI+ED VANFWC ++I IK++NLFT + L+ SL AT++ LPS + h 
Sbjct: 296 I,SRIJa>FERGIYEDyVAim?CrTSIl2lKWKNLFTTQSI^ 355 

Query: 265 PKRHLLPYALAACSMSFFIJ^SPQVHEKXXXXXXXXXXXXYTS^ 444 

P Y L SM+F+IiFSFQVHEK + L + ALF 
Sbjct: 356 PSNEGFLYGLLNSSflAFYIiFSPQVHEKSIIjMPFIjSATIjLALKLPDH^ ALP 411 

Query: 445 TLWPLLKKDNLVLQyCWMFMFSIWLIGNFSFVTPRFLPKFLTPG--PSIOT^ 618 

+++PLL +D Y + SF+ F + +PG +1 DV + 
Sbjct: 412 SMFPLLCRDKLLIPYLTL T--SFL PTVIYHSPGNHHAIQICroVSFFSFK 457 

Query: 619 LLPKSLIKRLIIVGSYlAMGIIHFLDYYVSPPSKYPDLWVIiANCSLGFSCFVTFWIWNOT 798 

P + L+ +I++ ++H L + PP KYP L+ L FS F+ F + NY 

Sbjct: 458 NFPCSYVF--LLRTHFFISV-VLHVLYLTIKPPQKYPFLFEALIMILCF^^ 514 

H. sapiens 
Score = 103 bits (258), Expect = 2e-21

Identities = 78/266 (29%), Positives = 123/266 (46%), Gaps = 3/266 (1%)

VSTALAFI6SPGPiyi--FGGYKHLVQSMHRIFPPARGIFEDKVAHFWCVSNIFIK^^ 180 
V A + SF ++ P + +Q + R+FP RG+FEDKVAN WC N+P+K +++ 
VKIiACIWASFVLCm*PFFTEREQTLQVLRRLFPVDRGLFEDKVANIW 291 

FTQKDLQLYSLLATVIGLLPSFIITFLYPKRHLLPYALAACSMSFFLFSFQVHEKXXXXX 360 

+ + S T + LLP+ I L P + L +C++SFFLFSFQVHEK 

LPRHIQLIMSFCFTFLSLLPACIKLILQPSSKGFKPTLVSCALSFFLFSFQVHEKSILLV 351 

XXXXXXXYTSRDWNVLSLVCWINWALFTLWPLLKKDNLVLQYGVMFM-FSm^ 537 

+ + + W V+ F++ PLL KD L++ V M F + +FS 

SLPVCLVLS EIPFMSTWFLLVSTFSMLPLLLKDEIiIJ^PSVVTTMAFFIACVTSFSI 407 

VTPRFLPKFLTPGPSISDIDVDYRRASLLPKSLIWRLIIVGSYIAMGIIHFLDYYVSPPS 717 

+ SIS V SI+++SIM+++ +PP 

FBKTSEEELQLKS FS I S VRKYLPCFTFLSRI IQYLFLI SVITMVLLTLMTVTLDPPQ 464 

KYPDLWVLANCSLGFSCFVTFWIWNN 795 
K PDL+ + C + F+ F ++ N 
KLPDLFSVLVCFVSCLNFLFFLVYFN 490 
FIGURE 30
FIGURE 31
FIGURE 34 (sheet 2)



TCATTCaAACTGAAAACAAT^CAGGAAGAGGGAATTGAGCCAATT^^ 

GGGGCCGATCCTAAACCAATTAATTTATTTATTTGGGAGGATGGGGGCG^ 

AGGGAGGAGAGGGGTTGAACAGTTTCCTTTTGTTCCTCACTGTTAATTCGCCCACCT 

TCGGGCCCTTCTTGTTCTGCAGCGCCAAGCAGGGTGCAGAGGGGCTGTGGCTTGCTT 

GAGGGGCCACTGTGGGGCTTCACTCCTGGTCACAGGTGGCAGCAG^ 

TCTATAAGCAGGGGGATGTAGCTCAGTTTGTAGAATGCTTGCATAGCATAAATGAAG 

TCCTGGGTTCCATCCCCAGCACCACATAAATGCyiGGTAAGAAA(^ 

CCAAGCATTCTCCTTGGCTACATAACAAAAGCAAGGCCTTT 

TACAAGAGACCCTATCTCAGAAAATTGTGGGGGGGAGGGGGGGGGAAATGGCCTTGA 
AAACACAGCCAGTCACTGTCACTGCATTGCCAGAACTGGTG 
GGCAGATAACAGCTAAAAGGCACATAACCTTGGTGGGGAAAT^^ 
CCTGAGGGCCCCACCAAGTTCCAAAAAAAAAAAA 



>gi|18997007|gb|AAL83249.1|AF474154_1 N-acetylglucosaminyltransferase V [Mus musculus]

MAFFSPWKIJSSQKLGFFLVTFGFIWG^MLmFTIQQRTQPESSSMLR^ 
IKALAEENRDVVDGPYAGVMTAYDLKICriAVLI^ 

STNSTTAVPSLVSLEKINVADIINGVQEKCVLPPMDGYPHCEGKIKl^^ 

YADYGVDGTSCSFFIYLSEVENWCPRLPWRAKOTYEEADHNSL^ 

KKHEEFRWMRLRIRRMADAWIQAIKSLAEKQNLEKRK^^ 

ETAFSGGPLGELVQWSDLITSLYLLGHDIRISASLAELKEIl^QCKWGmSGCP 

RIVELIYIDIVGLAQFKKTIiGPSWVHYQ(M^VLDSFGTEPEFNEIASYAQSKGI^ 

WGKmnijMPQQFYTI^PHTPDNSFLGFVVEQHLNSSDIHHI^ 

FWKNKKIYLDIIHTYlffiVHATVYGSSTKNIPSYVKlffi 

lgfpyegpapleaiangcaflnpkfnppksskotdf 
grphvwtvdlni^evedavkailnqkiep™ 
qvmwpplsalqvklaepgqsckqvcqesqlicepsffqhln^ 
lykdilvpsfypkskhcvpqgdlllfscagahpthqricpcrdfik 